Abstract

We introduce a new oriented evolving graph model inspired by bi- 
ological networks. A node is added at each time step and is connected 
to the rest of the graph by random oriented edges emerging from older 
nodes. This leads to a statistical asymmetry between incoming and 
outgoing edges. We show that the model exhibits a percolation transi- 
tion and discuss its universality. Below the threshold, the distribution 
of component sizes decreases algebraically with a continuously vary- 
ing exponent depending on the average connectivity. We prove that 
the transition is of infinite order by deriving the exact asymptotic 
formula for the size of the giant component close to the threshold. 
We also present a thorough analysis of aging properties. We compute 
local-in-time profiles for the components of finite size and for the giant 
component, showing in particular that the giant component is always 
dense among the oldest nodes but invades only an exponentially small 
fraction of the young nodes close to the threshold. 



1 Motivations and results. 

Evolving random graphs have recently attracted attention, see e.g. refs 
El IE] and references therein. This interest is mainly motivated by concrete 
problems related to the structure of communication or biological networks. 
Experimental data are now available in many contexts PJ Hj . 
In these examples, the asymmetry and the evolving nature of the networks
are likely to be important ingredients for deciphering their statistical
properties. It is however far from obvious to find solvable cases that would 
possibly account for some relevant features of, say, the regulating network of 
a genome. Although biology has strongly influenced our interest in evolving 
networks, the model we solve is not based on realistic biological facts but it 
nevertheless incorporates asymmetry and chronological order. Understand- 
ing such simple evolving graphs may help understanding biological networks, 
at least by comparison and opposition. 

We were initially motivated by the study of the yeast genetic regulatory
network presented in ref. [7]. The authors studied in- and out-degree
distributions and discovered a strong asymmetry: a single gene may participate
in the regulation of many other genes (the law for out-degrees seems to be
broad), but each gene is only regulated by a few other genes (the law for
in-degrees seems to have finite moments). This is why we consider oriented
evolving random graphs in the sequel. A biological interpretation for the 
asymmetry is that the few promoter-repressor sites for each gene bind only 
to specific proteins, but that along the genome many promoter-repressor sites 
are homologous. However, this does not predict the precise laws. An under- 
standing of the same features from a purely probabilistic viewpoint would be 
desirable as well. 

The recent experimental studies dealt with global statistical properties of 
evolving graphs, i.e. when the evolving network is observed at some fixed time 
with the ages of different vertices and edges not taken into account. There are 
simple experimental reasons for that: keeping track of the ages would in many
cases dramatically reduce the statistics, and in other cases this information
is not even available. Our second motivation is a better understanding of the
local-in-time statistical properties of evolving networks. This helps dating or 
assigning likely ages to different structures of the networks. As we shall later 
see, the global analysis, which is like a time average, gives a distorted view 
of the real structure of the networks. We shall present a detailed analysis of 
local-in-time features in our model. 

The model we study is the natural evolving cousin of the famous Erdős-Rényi
random graphs. Starting from a single vertex at time 1, a new
vertex is created at each time step (so that at time $t$, the size of the system,
i.e. the number of vertices, is $t$), and new oriented edges are created with
specified probabilistic rules. A tunable parameter $\alpha$ ranging from 0 to $\infty$
describes asymptotically the average number of incoming edges on a vertex.
Precise definitions are given in the next section. 

Our main results are the following:

From very simple rules, we see an asymmetry emerging. The global in-
and out-degree distributions are different. We also compute the local profiles
of in- and out-degree distributions, and comment on the differences.

We make a detailed global analysis for the structure and sizes of the 
connected components. We use generating function methods to write down 
a differential equation that implies recursion relations for the distribution of 
component sizes, see eqs. (11) and (14).

A salient global feature of the model is a percolation phase transition at a 
critical value of the average connectivity. Below this value, no single compo- 
nent contains a finite fraction of the sites in the thermodynamic limit, i.e. in 
the large t limit. However, a slightly unusual situation occurs in that below 
the transition the system contains components whose sizes scale like a power 
of the total size of the graph, see eq. (26). Correspondingly, the probability
distribution for component sizes has an algebraic tail, see eq. (23), and its
number of finite moments jumps at specific values of the average connectiv- 
ity. Above the transition, this probability distribution becomes defective, but 
its decrease is exponential, see eq. (|3^j) . The transition is continuous. Close 
to the threshold, the fraction of sites in the giant component - the perco- 
lation cluster - has an essential singularity, see eq. lpTTj) . We argue that this 
result is universal, with the meaning used in the study of critical phenom- 
ena. The essential singularity at the percolation threshold had already been 
observed numerically by |U] in a different model which we show to be in the 
same universality class as ours for the percolation transition, and computed 
analytically for another class of models in [I]. 

We then turn to the study of local-in-time profiles of connected compo- 
nents. Guided by a direct enumeration based on tree combinatorics, we show 
that they satisfy recursion relations, and we give the first few profiles (iso- 
lated vertices, pairs, triples) explicitly. The profile of the giant component 
is given by a differential equation, from which we extract the singularity in 
the far past and the critical singularity in the present, see eqs. (50) and (51). In
particular the giant component invades all the time slices of the graph above 
the transition. One strange feature of profiles, which would deserve a good 
explanation, is that in several instances the formal parameter involved in 
generating functions for global quantities is simply traded for the relative 
age to obtain interesting local-in-time observables, see eqs. (48) and (52).

We have compared our analytical results with numerical simulation when- 
ever possible. 

While polishing this paper, we became aware of [B], whose goals overlap 
partly with ours. When they can be compared, the results agree. 
2 The model.



We construct evolving random graphs with the following rules: 

(i) We consider a triangular array of independent random variables $\ell_{i,j}$,
$1 \le i < j$, where $\ell_{i,j}$ takes value 1 with probability $p_j \in [0, 1]$ and
value 0 with probability $q_j = 1 - p_j$.

(ii) We start from the graph made of a single vertex at initial time $t = 1$. At
time $t$, $t \ge 2$, a vertex with label $t$ is added together with the directed edges
$[j \to t]$ for which $\ell_{j,t} = 1$. We shall often take the viewpoint that the (biased)
coin tossings defining $\ell_{j,t}$ are done at time $t$.

We shall assume that $p_t \simeq \alpha/t$ at large time $t$, with $\alpha$ a parameter which
we shall identify as half the average connectivity. This choice ensures the
convergence of various distributions to stationary measures, most of them 
being independent of the precise values of the early probabilities. 

By construction all edges arriving at a given vertex are simultaneously 
created at the instant of creation of this vertex. As a consequence, these 
graphs are not only oriented but chronologically oriented - this is unrealistic 
from the biological viewpoint. 
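
Rules (i) and (ii) can be sketched in a few lines of Python. This is only a minimal illustration and not the simulation code used for this paper: the names `grow_graph` and `degrees` are hypothetical, and the choice $p_t = \min(1, \alpha/t)$ at early times is an assumption, since only the large-time behavior of $p_t$ is fixed above.

```python
import random


def grow_graph(t_max, alpha, seed=0):
    """Return the directed edges (j, t), j < t, of one realization."""
    rng = random.Random(seed)
    edges = []
    for t in range(2, t_max + 1):
        # Assumed early-time choice; the text only requires p_t ~ alpha/t.
        p_t = min(1.0, alpha / t)
        for j in range(1, t):
            # Edge variable l_{j,t}: every coin for vertex t is tossed at time t.
            if rng.random() < p_t:
                edges.append((j, t))
    return edges


def degrees(edges, t_max):
    """In- and out-degree of every vertex, indexed by label 1..t_max."""
    indeg = [0] * (t_max + 1)
    outdeg = [0] * (t_max + 1)
    for j, t in edges:
        outdeg[j] += 1
        indeg[t] += 1
    return indeg, outdeg
```

Every edge points from an older to a younger vertex, so vertex 1 never has incoming edges and the most recent vertex never has outgoing ones.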

3 Edge distributions. 

In this section we give the incoming and outgoing edge distributions. 

Let $\ell_j^-(t)$ be the number of incoming edges at the vertex $j$, and $\ell_j^+(t)$ be
the number of outgoing edges at this vertex at time $t$.

Let $v_k^-(t)$ be the number of vertices with $k$ incoming edges, and $v_k^+(t)$ be the
number of vertices with $k$ outgoing edges at time $t$.

We may look either at the edge distributions at a given vertex, or we may
look at the edge distributions defined by gathering averaged histograms over
whole graphs. The former are specified by their generating functions,

$$\langle z^{\ell_j^\pm(t)} \rangle$$

where $\langle \cdots \rangle$ denotes expectation value. It may depend on the specified vertex
labeled by $j$. The latter is defined by the generating functions,

$$V_t^\pm(z) \equiv \frac{1}{t} \sum_{0 \le k < t} \langle v_k^\pm(t) \rangle\, z^k = \frac{1}{t} \sum_{1 \le j \le t} \langle z^{\ell_j^\pm(t)} \rangle \tag{1}$$

We remark that this global histogram distribution is the average of the
local-in-time quantity $\langle z^{\ell_j^\pm(t)} \rangle$. Since at time $t$ the total number of vertices is $t$,
$V_t^\pm(z)$ is properly normalized, $V_t^\pm(1) = 1$, to define an averaged probability
distribution function, independent of the vertices, for the incoming or
outgoing edge variables $\ell^\pm$:

$$V_t^\pm(z) = \sum_k \mathrm{Prob}(\ell^\pm = k)\, z^k$$

Incoming vertices. The number of incoming edges $\ell_j^-(t) = \sum_{i<j} \ell_{i,j}$ at
vertex $j \le t$ asymptotically possesses a Poisson distribution since

$$\langle z^{\ell_j^-(t)} \rangle = \prod_{i<j} [q_j + z p_j] \;\to_{j \to \infty}\; \exp(\alpha(z - 1))$$

The convergence of this distribution justifies our choice of asymptotic
probabilities $p_j \simeq \alpha/j$. Only the vertices whose ages $j$ scale with the age of the
graph, i.e. with $j/t = \sigma$ fixed, $0 < \sigma < 1$, give non trivial contributions at
large time to the averaged histogram and

$$V^-(z) \equiv \lim_{t \to \infty} V_t^-(z) = \exp(\alpha(z - 1)) \tag{2}$$

This expression may also be retrieved by looking at the evolution equation of
$V_t^-(z)$. Indeed, consider adding the new vertex at time $t$. Since the edges are
oriented from older to younger vertices (footnote 1), we have
$t V_t^-(z) = (t - 1) V_{t-1}^-(z) + \langle z^{\ell_t^-(t)} \rangle$ from the second definition in eq. (1).
This is equivalent to

$$t V_t^-(z) = (t - 1) V_{t-1}^-(z) + (q_t + z p_t)^{t-1}$$

As $(q_t + z p_t)^{t-1} \simeq e^{\alpha(z-1)}$ at large time, the stationary limit is given by eq. (2).
This yields a Poissonian distribution with probabilities

$$\mathrm{Prob}(\ell^- = k) = e^{-\alpha}\, \frac{\alpha^k}{k!} \tag{3}$$

Outgoing vertices. At a given vertex $j \le t$, with $j/t = \sigma$ fixed, the
number of outgoing edges $\ell_j^+(t) = \sum_{j < i \le t} \ell_{j,i}$ at vertex $j$ also has a Poisson
distribution at large time,

$$\langle z^{\ell_j^+(t)} \rangle = \prod_{\sigma t = j < i \le t} [q_i + z p_i] \;\to_{t \to \infty}\; \exp(-\alpha \log\sigma\, (z - 1))$$

but with a parameter $\alpha \log(1/\sigma)$ depending on the age of the vertex.
Approximating at large time the sum over $j$ in eq. (1) by an integral over $\sigma$ gives
the histogram distribution:

$$V^+(z) \equiv \lim_{t \to \infty} V_t^+(z) = \int_0^1 d\sigma\, \langle z^{\ell_{\sigma t}^+(t)} \rangle = \frac{1}{1 + \alpha(1 - z)} \tag{4}$$

As for incoming vertices, this formula follows from the evolution equation
for $V_t^+(z)$. Indeed, since the numbers of outgoing edges $\ell_j^+(t)$ from vertex $j$
at time $t$ and $t - 1$ differ by $\ell_{j,t}$ we have
$\langle z^{\ell_j^+(t)} \rangle = \langle z^{\ell_j^+(t-1)} \rangle \langle z^{\ell_{j,t}} \rangle$. From
definition (1) this gives

$$t V_t^+(z) = 1 + (t - 1) V_{t-1}^+(z)\, (q_t + z p_t)$$

where the first term is the contribution of the newly added vertex at time
$t$. The stationary limit is given by eq. (4). This is a geometric distribution,
slightly broader than the Poisson distribution, with probabilities

$$\mathrm{Prob}(\ell^+ = k) = \frac{\alpha^k}{(1 + \alpha)^{k+1}} \tag{5}$$

Footnote 1: A vertex is older than another if it appeared before in the
evolution, i.e. if it corresponds to a smaller value of $\sigma$.

Mixed distribution. Let $v_{k_+,k_-}(t)$ be the number of vertices with $k_+$
outgoing and $k_-$ incoming edges at time $t$. As in eq. (1), the generating function
for the mixed histogram distribution is defined by

$$V_t(z_+, z_-) = \frac{1}{t} \sum_{k_+, k_-} \langle v_{k_+,k_-}(t) \rangle\, z_+^{k_+} z_-^{k_-} = \frac{1}{t} \sum_{1 \le j \le t} \langle z_+^{\ell_j^+(t)} z_-^{\ell_j^-(t)} \rangle$$

By construction the outgoing and incoming edge variables $\ell_j^\pm(t)$ are
statistically independent for $j$ fixed, so that the last expectation values factorize.
As above we may derive an evolution equation by evaluating the contribution
of the newly added vertex at time $t$. This yields:

$$t V_t(z_+, z_-) = (q_t + z_- p_t)^{t-1} + (t - 1) V_{t-1}(z_+, z_-)\, (q_t + z_+ p_t)$$

Its stationary limit is factorized:

$$V(z_+, z_-) = \frac{e^{\alpha(z_- - 1)}}{1 + \alpha(1 - z_+)}$$

Outgoing and incoming edges are statistically independent at large time.
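
The asymmetry between the two global laws can be made concrete numerically. The sketch below evaluates the Poisson law (3) and the geometric law (5) and compares their first two moments; the helper names and the truncation `k_max` are assumptions made for the illustration.

```python
import math


def poisson_pmf(k, alpha):
    """Eq. (3): law of the number of incoming edges."""
    return math.exp(-alpha) * alpha**k / math.factorial(k)


def geometric_pmf(k, alpha):
    """Eq. (5): law of the number of outgoing edges."""
    return alpha**k / (1.0 + alpha) ** (k + 1)


def moments(pmf, alpha, k_max=60):
    """Total mass, mean and variance, truncated at k_max (an assumed cutoff)."""
    ks = range(k_max + 1)
    total = sum(pmf(k, alpha) for k in ks)
    mean = sum(k * pmf(k, alpha) for k in ks)
    var = sum(k * k * pmf(k, alpha) for k in ks) - mean**2
    return total, mean, var
```

Both laws have mean $\alpha$, as they must since every edge is incoming for one vertex and outgoing for another, but the variance is $\alpha$ for incoming edges and $\alpha(1 + \alpha)$ for outgoing ones.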



4 Cluster distributions. 

In this section, we present the main relations governing the probability distri- 
butions of connected components of the graphs. Two vertices belong to the 
same connected component if they can be joined by a path made of edges, 
without any reference to orientation. This definition ensures that the prop- 
erty of being in the same connected component is an equivalence relation, 
but does not ensure that two points in a connected component can be joined
by an oriented path.

To partly avoid repetitions, the term cluster is used as a synonym for
connected component in the sequel.

Intuitively, the fact that the network is fragmented can be understood
as follows: when a vertex $t_0$ is created, it has a finite probability to be
isolated, and the probability that none of the vertices $t_0 + 1, \cdots, t$ connects
to vertex $t_0$ is $q_{t_0+1} \cdots q_t$ which scales as $(t_0/t)^\alpha$. This quantity remains finite
as long as $t_0/t$ does. This argument shows that there are isolated vertices
in the system. A small extension of the argument shows that there are also
finite components and that young vertices are more likely to be in small
components than old ones. This will be made more rigorous in the study
of profiles, see section |H1 
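
The scaling of the survival probability of an isolated vertex can be checked directly. The sketch below compares the exact product $q_{t_0+1} \cdots q_t$ with the estimate $(t_0/t)^\alpha$, assuming $q_s = 1 - \alpha/s$ exactly for all $s > t_0$ (the text only fixes this asymptotically); the function names are hypothetical.

```python
def isolation_probability(t0, t, alpha):
    """Exact product q_{t0+1} ... q_t with q_s = 1 - alpha/s (assumed form)."""
    prob = 1.0
    for s in range(t0 + 1, t + 1):
        prob *= 1.0 - alpha / s
    return prob


def scaling_estimate(t0, t, alpha):
    """Large-time estimate (t0/t)^alpha quoted in the text."""
    return (t0 / t) ** alpha
```

For $t_0$ large the two expressions agree up to corrections of relative order $1/t_0$, and both depend on the times only through the ratio $t_0/t$.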

Let $N_k(t)$ be the number of connected components with $k$ vertices at time
$t$ and let $N_t(z)$ be the generating function,

$$N_t(z) = \sum_{k \ge 1} N_k(t)\, z^k$$

By definition, $\sum_k N_k(t)$ is the number of components and $\sum_k N_k(t) k$ the total
number of vertices, $\sum_k N_k(t) k = t$ at any finite time.

Let us write an evolution equation for $N_t(z)$. At time $t + 1$, we add the
vertex with label $t + 1$ which may then be connected to $n_k(t)$ connected
components of size $k$. This creates a new component of size $1 + \sum_k n_k(t) k$,
but also removes $n_k(t)$ components of size $k$. Thus, at time $t + 1$ we have:

$$N_k(t + 1) = N_k(t) - n_k(t) + \delta_{k;\, 1 + \sum_p n_p(t) p} \tag{6}$$

with $\delta_{j;k}$ the Kronecker symbol. Alternatively,

$$N_{t+1}(z) = N_t(z) - \sum_{k \ge 1} n_k(t)\, z^k + z^{1 + \sum_k n_k(t) k} \tag{7}$$

As is apparent from this formulation, the transition probability from a
given $N_t(z)$ to a given $N_{t+1}(z)$ can be given in closed form. To be precise, the
admissible $N_t(z)$'s (describing the accessible distributions of components at
time $t$) are polynomials with integral non-negative coefficients, whose
derivative at $z = 1$ has value $t$. Now suppose $N_t(z)$ and $N_{t+1}(z)$ are admissible.
If the difference $N_{t+1}(z) - N_t(z)$ cannot be written as
$-\sum_{k \ge 1} n_k(t) z^k + z^{1 + \sum_k n_k(t) k}$ for some set of nonnegative integers $\{n_k(t)\}$, the
transition is forbidden. If it can, then the $n_k(t)$'s are uniquely defined and the
transition probability is

$$\mathrm{Prob}\big(N_t(z) \to N_{t+1}(z)\big) = \prod_k \binom{N_k(t)}{n_k(t)}\, q_{t+1}^{k (N_k(t) - n_k(t))}\, \big(1 - q_{t+1}^k\big)^{n_k(t)}$$

The meaning of this equation is simple. At time $t + 1$, the new vertex is added,
and for each of the former $t$ points a (biased) coin is tossed to decide the value
of the edge variables $\ell_{j,t+1}$. The tossings are independent with the same law,
so the probability that the new point does not attach to a given component of
size $k$ is $q_{t+1}^k$, and distinct components are independent. Hence for each $k$ one
makes $N_k(t)$ independent Bernoulli trials with failure probability $q_{t+1}^k$ and
the transition $N_t(z) \to N_{t+1}(z)$ requires exactly $n_k(t)$ successes. This
shows that the graph evolution is a (time inhomogeneous) Markov process
on the space of components distributions, a fact that we shall use for the
purpose of numerical simulations.
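
A minimal sketch of this Markov process, acting on the histogram of component sizes rather than on the graph itself, could look as follows. It is an illustration under stated assumptions and not the simulation code used for the figures: the names `step` and `simulate` are hypothetical, the Bernoulli trials are drawn one by one for clarity, and $p_t = \min(1, \alpha/t)$ is assumed at early times.

```python
import random
from collections import Counter


def step(sizes, t, alpha, rng):
    """Advance the histogram {k: N_k(t)} from time t to time t + 1, eq. (6)."""
    q = 1.0 - min(1.0, alpha / (t + 1))  # q_{t+1}, assumed early-time choice
    new_size = 1  # the new vertex itself
    for k, n_comp in list(sizes.items()):
        hit = 1.0 - q**k  # a size-k component is attached with this probability
        n_k = sum(rng.random() < hit for _ in range(n_comp))
        if n_k:
            sizes[k] -= n_k
            if sizes[k] == 0:
                del sizes[k]
            new_size += k * n_k
    sizes[new_size] += 1


def simulate(t_max, alpha, seed=0):
    """Histogram of component sizes at time t_max for one realization."""
    rng = random.Random(seed)
    sizes = Counter({1: 1})  # a single vertex at time 1
    for t in range(1, t_max):
        step(sizes, t, alpha, rng)
    return sizes
```

The sum rule $\sum_k N_k(t) k = t$ is preserved at each step, which gives a simple sanity check on any implementation.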

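This explicit representation of the transition probability could be used to
average equation (7). Alternatively, one can represent the number $n_k(t)$ of
components of size $k$ which are connected to the new vertex in terms of the
edge variables $\ell_{j,t+1}$ as

$$n_k(t) = \sum_{[k]=1}^{N_k(t)} \Big[ 1 - \prod_{j \in [k]} (1 - \ell_{j,t+1}) \Big]$$

where $[k]$ runs over connected components of size $k$. Since the edge variables
$\ell_{j,t+1}$ are statistically independent of the earlier edge variables, $\ell_{j,k}$ with $k \le t$,
and therefore also independent of the $N_k(t)$'s, we have for any $w$,

$$\langle w^{n_k(t)} \rangle = \big\langle \big[ w + (1 - w)\, q_{t+1}^k \big]^{N_k(t)} \big\rangle \tag{8}$$

In particular,

$$\langle n_k(t) \rangle = (1 - q_{t+1}^k)\, \langle N_k(t) \rangle \tag{9}$$

We can now take the average value of eq. (7) to get

$$\langle N_{t+1}(z) \rangle = \langle N_t(z q_{t+1}) \rangle + z \Big\langle \prod_{k \ge 1} \big[ q_{t+1}^k + (1 - q_{t+1}^k)\, z^k \big]^{N_k(t)} \Big\rangle$$

$$\phantom{\langle N_{t+1}(z) \rangle} = \langle N_t(z q_{t+1}) \rangle + z\, q_{t+1}^t \Big\langle \prod_{k \ge 1} \big[ 1 + (q_{t+1}^{-k} - 1)\, z^k \big]^{N_k(t)} \Big\rangle \tag{10}$$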

Assume, but this will be justified later using the asymptotic behavior
of the $p_j$'s, that at large time and for fixed component size $N_k(t)/t$ is
self-averaging, meaning that

$$N_k(t)/t = C_k + o(1), \quad \text{as } t \to \infty$$

with $C_k$ equal to its averaged value with probability one and with the
remaining $o(1)$ terms random. This in particular implies that for any finite
size $k$ the number of connected components of size $k$ scales
thermodynamically with the graph size. If both sides of eq. (10) are expanded in powers of
$z$, a given degree involves only components of bounded ($t$-independent) size.
So, order by order in $z$, self-averaging applies and we conclude that taking
$1 - q_t \simeq \alpha/t$, the following is an accurate approximation for the last term in
eq. (10) at large times:

$$q_{t+1}^t \prod_{k \ge 1} \big[ 1 + (q_{t+1}^{-k} - 1)\, z^k \big]^{N_k(t)} \simeq \exp\Big( -\alpha + \alpha \sum_{k \ge 1} k z^k C_k \Big).$$

The averaged evolution equation (10) then gives a deterministic differential
equation

$$0 = -C(z) - \alpha z \partial_z C(z) + z \exp\big( -\alpha + \alpha z \partial_z C(z) \big) \tag{11}$$

for the generating function

$$C(z) = \sum_{k \ge 1} z^k C_k.$$

The function

$$z \partial_z C(z) = \sum_{k \ge 1} z^k P_k, \qquad P_k = k C_k$$

has a direct probabilistic interpretation. Indeed, $P_k$ is the fraction of points
in clusters of size $k$, or equivalently the probability that a randomly chosen
vertex belongs to a connected component of size $k$.

By construction $C(z)$ is a Taylor series with positive coefficients. The
series $\sum_k k C_k = \sum_k P_k$ is convergent because it counts the fraction of points
in finite clusters which is $\le 1$. As a consequence the radius of convergence
of $C(z)$ is at least 1: if we denote by $R(\alpha)$ the radius of convergence of $C(z)$,
we know that $R(\alpha) \ge 1$ for $0 < \alpha < +\infty$.

We show in Appendix A that analogous methods can be used to describe 
the connected components of the Erdos-Renyi random graph model. 

Most of our subsequent analysis will be based on (11). It recursively
determines the $C_k$'s. The first few are:

$$C_1 = \frac{e^{-\alpha}}{\alpha + 1}, \qquad C_2 = \frac{\alpha\, e^{-2\alpha}}{(\alpha + 1)(2\alpha + 1)}, \qquad C_3 = \frac{\alpha^2 (6\alpha + 5)\, e^{-3\alpha}}{2 (\alpha + 1)^2 (2\alpha + 1)(3\alpha + 1)} \tag{12}$$

More generally, the combinations $C_k e^{k\alpha}$ are rational functions in $\alpha$. They are
more efficiently computed using a second order differential equation which
follows from derivation of the logarithm of eq. (11):

$$z \partial_z (C + \alpha z \partial_z C) = (C + \alpha z \partial_z C)\, \big( 1 + \alpha (z \partial_z)^2 C \big) \tag{13}$$
This leads to

$$(n - 1)(1 + n\alpha)\, C_n = \alpha \sum_{k + l = n} k^2 (1 + \alpha l)\, C_k C_l \tag{14}$$

A general feature of the recursion relations is that the rational function $C_k e^{k\alpha}$
has no poles except possibly at the points $-1, -1/2, \cdots, -1/k$.
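
As a cross-check, the recursion (14) can be iterated numerically, starting from $C_1$, and compared with the closed forms (12). The following is a minimal sketch in floating-point arithmetic; the function name `cluster_coefficients` is hypothetical.

```python
import math


def cluster_coefficients(alpha, n_max):
    """Return [0, C_1, ..., C_{n_max}] from the recursion (14)."""
    C = [0.0] * (n_max + 1)
    C[1] = math.exp(-alpha) / (1.0 + alpha)  # first coefficient, eq. (12)
    for n in range(2, n_max + 1):
        # Right-hand side of (14): sum over k + l = n with k, l >= 1.
        s = sum(
            k * k * (1.0 + alpha * (n - k)) * C[k] * C[n - k]
            for k in range(1, n)
        )
        C[n] = alpha * s / ((n - 1) * (1.0 + n * alpha))
    return C
```

All terms of the recursion are positive, so the iteration is numerically stable; its cost grows quadratically with `n_max`.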

Eq. (10) is not a closed formula for the probability distributions of the
number of connected components. But as shown in Appendix B, one may im- 
prove slightly the above argument to obtain a closed system (see eqE3J) which 
can be used to prove systematically that the variables $C_k$ are self-averaging,
a task we perform for $C_1$ and $C_2$ in Appendix B. The proof becomes more
and more tedious for large $k$. But we are confident that the self-averaging
property is general because of the close relationship with the Erdős-Rényi
model presented in Appendix A.

5 Percolation transition. 

Up to now, our analysis of connected components has always concentrated
on finite components: the thermodynamic limit $t \to \infty$ has been taken while
keeping $k$, the component size, arbitrary but fixed. We have argued that
in this regime, the number of components of size $k$ is proportional to $t$.
This has led to a satisfactory description of $C_k$. Our arguments did not
use any hypothesis on whether or not only finite components play a role
in the thermodynamic limit. However, as we observed already, the sum
$\sum_k P_k$ measures exactly the fraction of the sites that are either isolated or
in components of size 2 or 3 or $\cdots$. To rephrase it more vaguely, $\sum_k P_k$
counts the fraction of vertices that belong to finite connected components.
If only clusters of finite size contribute to the thermodynamic limit, then
$\sum_k P_k = 1$. Else, by standard physical arguments, a single giant component
of size $\simeq t (1 - \sum_k P_k)$ accounts for the deficit. The giant component is also
called the percolating cluster. Its relative size, which we denote by $P_\infty$, is

$$P_\infty = 1 - \sum_k P_k = 1 - \partial_z C(1) \tag{15}$$

To discriminate between the two situations, $P_\infty$ may be computed
numerically by evaluating a large number of coefficients $P_k$ using the recursion
relation (14). As we shall see later, the convergence of the series is slow
for the whole relevant range of $\alpha$. The result of such a partial summation
$\sum_{k \le k_{\max}} P_k$ is plotted in Fig. 1 for $k_{\max} = 2^{11}, \cdots, 2^{15}$. It reveals a phase
transition at a value $\alpha_c$ between .24 and .29, going from a regime where finite
components contain all vertices to a regime where they do not. Below .24
and above .29, the plots corresponding to different values of $k_{\max}$ are hard to
distinguish, but in the transition region, large values of $k$ make substantial
contributions to $\sum P_k$. The transition is also manifest on an analogous study
of $\sum k P_k$. We shall show later that 1/4 is the exact threshold in this model
and that $\sum k P_k$ is discontinuous at the transition.

Figure 1: Partial resummations of $P_k$ and $k P_k$. Left: The fraction of sites
occupied by the giant component. Right: The variation of $\sum k P_k$ close to the
transition.
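
The partial summation described above is easy to reproduce. The sketch below sums $P_k = k C_k$ up to a cutoff, using the recursion (14); the function names and the cutoff values are assumptions for the illustration, and much larger cutoffs (with a faster implementation) would be needed to resolve the transition region itself.

```python
import math


def cluster_coefficients(alpha, n_max):
    """Return [0, C_1, ..., C_{n_max}] from the recursion (14)."""
    C = [0.0] * (n_max + 1)
    C[1] = math.exp(-alpha) / (1.0 + alpha)
    for n in range(2, n_max + 1):
        s = sum(
            k * k * (1.0 + alpha * (n - k)) * C[k] * C[n - k]
            for k in range(1, n)
        )
        C[n] = alpha * s / ((n - 1) * (1.0 + n * alpha))
    return C


def finite_fraction(alpha, k_max):
    """Partial sum of P_k = k C_k for k <= k_max."""
    C = cluster_coefficients(alpha, k_max)
    return sum(k * C[k] for k in range(1, k_max + 1))
```

Far below the threshold the partial sum saturates at 1 within numerical accuracy, while far above it the sum converges to a value strictly smaller than 1, the deficit being the estimate of $P_\infty$ in eq. (15).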

The growth of $P_\infty$ just above the threshold seems to start with many
vanishing derivatives, in strong contrast with what happens in the Erdős-Rényi
random graph, for which the growth of the giant component is linear
close to the transition. This can be related to the following observations:

- As we have recalled in Appendix A, in the Erdős-Rényi model the
components of size $k$ occupy a fraction $\frac{k^{k-1}}{k!} \alpha^{k-1} e^{-k\alpha}$. As a function of $\alpha$,
this fraction has a single maximum at $\alpha = 1 - 1/k$. These values accumulate
at $\alpha = 1^-$, the well-known transition point for the standard random graph.
Then for $\alpha > 1$, the fraction of sites occupied by components of size $k$
decreases with a finite slope for all $k$'s, and so does the sum, so that the
growth of the giant component is linear close to the transition.

- In the model studied in this paper, the behavior of $P_k(\alpha)$ as a function of
$\alpha$ for generic $k$ is not so easy to get at. However, a simple numerical analysis
leads to the following picture: $P_1$ is a decreasing function of $\alpha$, but for $k > 1$,
$P_k$ has a single maximum, at say $\alpha_k$. This sequence starts with $\alpha_2 \simeq .241$,
$\alpha_3 \simeq .311$, $\alpha_4 \simeq .341$, is maximum for $k = 12$ with $\alpha_{12} \simeq .375$ and then
decreases very slowly ($\alpha_{100} \simeq .338$, $\alpha_{1000} \simeq .301$, $\alpha_{10000} \simeq .282$), apparently
getting closer and closer to 1/4. At the transition, most $P_k(\alpha)$'s are still
increasing, and close enough to the transition a finite but large number of
them is still increasing. So subtle compensation mechanisms can take place,
leading possibly to the vanishing of (infinitely) many derivatives of $P_\infty$ at
$\alpha = 1/4$.

This is confirmed in the following subsections, which are also devoted to 
a more precise description of the distributions of finite and infinite clusters. 
To summarize: 

$$P_\infty = 0 \ \text{ if } \alpha \le 1/4, \qquad P_\infty > 0 \ \text{ if } \alpha > 1/4.$$

6 Behavior below the transition, $\alpha < \alpha_c$.

We turn to the examination of the consequences of eq. (11), the equation that
determines the generating function for the number of clusters of given finite
size.

We know that $R(\alpha)$, the radius of convergence of $C(z)$, is at least 1. To
analyze the behavior of $C(z)$ around $z = 1$, we define

$$F(\tau) = \alpha - \tau - \alpha\, \partial_\tau Y(\tau), \qquad Y(\tau) = C(e^\tau). \tag{16}$$

From eq. (11) $F$ satisfies

$$\alpha (1 - e^{-F})\, \partial_\tau F = -F - \tau \tag{17}$$

Below the transition, there is no percolating cluster so that $\partial_\tau Y(0) =
\sum_k P_k = 1$ or alternatively,

$$\sum_k P_k = 1, \qquad \alpha < 1/4 \tag{18}$$

This makes clear that the normalized positive numbers $P_k$ are the
probabilities for a vertex to be in a connected component of size $k$. For $F(\tau)$ this
translates into the boundary condition

$$F(0) = 0, \qquad \alpha < 1/4$$

6.1 Scaling laws. 

We first look for a formal solution $F^{tr}(\tau)$ of eq. (17) in the form of a Taylor
series in $\tau$:

$$F^{tr}(\tau) = -\tau - \alpha \sum_{m \ge 1} \mu_m \frac{\tau^m}{m!} \tag{19}$$

The Taylor coefficients are the moments of the measure, $\mu_m = \sum_k k^m P_k$.
For example $\mu_1$ is the average size of the component containing a randomly
chosen vertex. As for eq. (13), the differential equation (17) may be turned into
a second order differential equation for $Y(\tau)$,

$$\partial_\tau Y + \alpha\, \partial_\tau^2 Y = (Y + \alpha\, \partial_\tau Y)\, (1 + \alpha\, \partial_\tau^2 Y) \tag{20}$$

Eq. (11) with $\partial_\tau Y(0) = 1$ gives $Y(0) = 1 - \alpha$. Eq. (20) then allows us to
recursively compute the $\mu_m$'s:

$$\mu_1 = \frac{4}{(1 + b)^2} = \frac{1 - 2\alpha - \sqrt{1 - 4\alpha}}{2\alpha^2} \tag{21}$$

$$\tfrac{1}{2} \big( n - 1 - (n + 1) b \big)\, \mu_n = -(1 + \alpha \mu_1)\, \mu_{n-1} - \alpha \sum_{\substack{k + l = n + 1 \\ k, l \ge 2}} \binom{n}{k}\, (\mu_{k-1} + \alpha \mu_k)\, \mu_l$$

with

$$4\alpha = 1 - b^2, \qquad b = \sqrt{1 - 4\alpha}$$
The square root singularity in the expression of $\mu_1$ indicates that the initial
boundary condition (18) becomes pathological at $\alpha = 1/4$ and thus signals
the percolation transition. The recursion relation for the higher Taylor
coefficients shows that $\mu_n$ possesses a pole at $b = (n - 1)/(n + 1)$, where the
prefactor $n - 1 - (n + 1) b$ vanishes. It actually
changes sign, from positive to negative, across the pole. As a function of $n$,
the $\mu_n$'s have a simple pole at $n_*(b)$,

$$\mu_n \simeq \frac{\text{const.}}{n_*(b) - n}, \qquad n_*(b) = \frac{1 + b}{1 - b}, \qquad n \ge 2 \tag{22}$$

where the numerator does not vanish at $(n - 1) = b (n + 1)$. Since $\mu_n =
\sum_k k^n P_k$ at least for $n < n_*(b)$, this simple pole codes for the asymptotic
behavior of $P_k$ at large $k$. Indeed, recall that if $P_k \sim k^{-\nu}$ as $k \to \infty$ then
$\sum_k k^x P_k$ diverges as $x \to \nu - 1$ from below with a simple pole at this value.
Thus assuming that we may extend the simple pole (22) to non-integer values
of $n$ we learn that

$$P_k \simeq \text{const.}\; k^{-\nu(b)} \tag{23}$$

with

$$\nu(b) = 1 + n_*(b) = \frac{2}{1 - b} = \frac{1 + \sqrt{1 - 4\alpha}}{2\alpha} \tag{24}$$
la 
This means in particular that the probability distribution $P_k$ (which describes
the proportion of the system occupied by clusters of size $k$) is broad and only
the moments $\mu_n$ for $n < n_*(b)$ exist.
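
The closed forms (21), (22) and (24) are simple enough to tabulate. The sketch below evaluates them for $0 < \alpha \le 1/4$ and checks a few identities that follow from $4\alpha = 1 - b^2$; the function names are hypothetical.

```python
import math


def b_of(alpha):
    """b = sqrt(1 - 4 alpha), defined for alpha <= 1/4."""
    return math.sqrt(1.0 - 4.0 * alpha)


def mu1(alpha):
    """First moment, eq. (21), written in terms of b."""
    return 4.0 / (1.0 + b_of(alpha)) ** 2


def mu1_alt(alpha):
    """Same quantity, second form of eq. (21), in terms of alpha."""
    return (1.0 - 2.0 * alpha - math.sqrt(1.0 - 4.0 * alpha)) / (2.0 * alpha**2)


def n_star(alpha):
    """Position of the pole in n, eq. (22); requires alpha > 0."""
    b = b_of(alpha)
    return (1.0 + b) / (1.0 - b)


def nu(alpha):
    """Exponent of the algebraic tail of P_k, eq. (24)."""
    return 1.0 + n_star(alpha)
```

At the threshold $\alpha = 1/4$ one finds $\mu_1 = 4$ and $\nu = 2$, while for small $\alpha$ the exponent $\nu \simeq 1/\alpha$ becomes large and many moments exist.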

The fact that the Taylor coefficients $\mu_n$, computed from the recursion
relation (21), cannot consistently be interpreted as moments of a probability
distribution for $n > n_*(b)$ indicates that the expansion (19) is only valid up to
$o(\tau^{n_*})$ terms. Indeed, the differential equation (17) is compatible with an
expansion of $F(\tau)$ for $\tau < 0$, $z = e^\tau < 1$, of the form:

$$F(\tau) = -\tau - \alpha \sum_{q, p \ge 0} \frac{(-1)^q}{q!}\, y_{p,q}\, (-\tau)^{q + p\, n_*(b)},$$

with $y_{0,0} = 0$ and $y_{0,m} = \mu_m$. As a function of the complex variable $z$, $\partial_z C(z)$
has thus a branch cut starting at $z = 1$:

$$\partial_z C(z) \simeq \text{const.}\, (1 - z)^{n_*(b)} + \cdots, \quad \text{around } z = 1 \tag{25}$$

For $n_*(b)$ an integer, this formula should become
$\partial_z C(z) \simeq \text{const.}\, (1 - z)^{n_*(b)} \log(1 - z) + \cdots$. The cut implies the asymptotic
behavior (23) for

$$P_k = \oint \frac{dz}{2 i \pi\, z^k}\, \partial_z C(z).$$

At the transition but from below, i.e. $\alpha = 1/4^-$, there are logarithmic
corrections to the scaling behavior (23), and the branch cut equation is
$\partial_z C(z) \simeq \text{const.}\, (1 - z)/\log(1 - z) + \cdots$. This ensures that the first moment
is finite, and its value, computed from (21), is $\mu_1|_{\alpha = 1/4^-} = 4$.

6.2 Scaling domain, $\alpha < \alpha_c$.

The scaling law (23) may be linked to the typical growth rate of large clusters
in the system. For concreteness, consider the component of vertex $t'$ for any
given $t'$. For very large $t$, the number of arrows emerging from vertex $t'$ grows
like $\alpha \log t$. Then we infer that the size of the genealogical tree of $t'$ will grow
like $t^\alpha$ (under the hypothesis that the genealogical tree is indeed tree-like,
a reasonable assumption for small $\alpha$). This counting of descendants gives a
crude lower bound for the size of the connected component of $t'$. Hence we
expect that the system contains components whose sizes grow like a power
of $t$. To estimate this power, we argue as follows.

Consider a given large cluster of size $k(t) \ll t$ at time $t$:

i) $k(t + 1) - k(t)$ is 0 with probability $q_{t+1}^{k(t)}$ and $1 + \sum_p p\, n_p(t)$ with
probability $1 - q_{t+1}^{k(t)} \simeq \alpha k(t)/t$ times the probability that vertex $t + 1$ connects to
$n_p(t)$ clusters of size $p$ apart from the large cluster;

ii) removing the given large cluster does not change the thermodynamical
properties of the graph, so the probability that vertex $t + 1$ connects to $n_p(t)$
clusters of size $p$ apart from the large cluster is simply the probability that
vertex $t + 1$ connects to $n_p(t)$ clusters of size $p$. Hence for large $t$, from eq. (9),

$$\Big\langle \sum_p p\, n_p(t) \Big\rangle \simeq \alpha \sum_k k P_k = \alpha \mu_1;$$

iii) suppose we add $\delta t$ new vertices with $\delta t \ll t$ but $k(t)\, \delta t \gg t$. Between
time $t$ and time $t + \delta t$, many new clusters have been connected to the given
large component so $1 \ll k(t + \delta t) - k(t) \ll k(t)$, but this has not changed
the thermodynamical properties of the graph. Hence we can average the
equation in i) to get a deterministic equation

$$k(t + \delta t) - k(t) \simeq \alpha k(t) (1 + \alpha \mu_1)\, \delta t / t.$$

This leads to

$$k(t) \sim t^{1/\nu} \quad \text{with} \quad 1/\nu = \alpha (1 + \alpha \mu_1) = 2\alpha/(1 + b). \tag{26}$$

We find that the growth rate of large clusters is universal. As expected,
their growth exponent is larger than $\alpha$, the genealogical tree growth, because
it takes not only descendants into account but the whole component. The
difference is maximum at the transition, where $\nu = 2$.
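
The passage from the averaged growth law to the power law (26) can be illustrated by iterating the deterministic equation with unit time steps. Taking $\delta t = 1$ is an assumption made for the illustration (the argument above requires a coarse-grained $\delta t$), and the function names are hypothetical.

```python
import math


def growth_exponent(alpha):
    """1/nu = alpha (1 + alpha mu_1) = 2 alpha / (1 + b), eq. (26)."""
    b = math.sqrt(1.0 - 4.0 * alpha)
    return 2.0 * alpha / (1.0 + b)


def iterate_size(k0, t0, t1, alpha):
    """Iterate k(t + 1) = k(t) (1 + c / t) from t0 to t1, with c = 1/nu."""
    c = growth_exponent(alpha)
    k = float(k0)
    for t in range(t0, t1):
        k *= 1.0 + c / t
    return k


def fitted_exponent(t0, t1, alpha):
    """Slope of log k(t) against log t between t0 and t1."""
    return math.log(iterate_size(1.0, t0, t1, alpha)) / math.log(t1 / t0)
```

The fitted slope reproduces $1/\nu$ up to corrections of order $1/t_0$, and $1/\nu$ exceeds the genealogical exponent $\alpha$ for every $0 < \alpha \le 1/4$.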

The fact that the same exponent, $\nu$, governs the asymptotic behavior of $P_k$
at large $k$ and the size of large clusters for large $t$ can be understood directly
as follows. First consider one realization of the random graph for a given $t$
and suppose that there is a single component of maximal size, say $K$. Observe
that $\sum_{k > l} k N_k(t)$ is by definition the number of points in components of size
larger than $l$. This number is strictly larger than $l$ if $l < K$, but vanishes if
$l \ge K$. So $K$ is characterized by the identity $\sum_{k \ge K} k N_k(t) = K$.

From this we infer by taking the average that for large $t$ the relation

$$\sum_{k \ge k(t)} k \langle N_k(t) \rangle \simeq k(t)$$

gives a sensible characterization for $k(t)$, the order of magnitude of the
size of large components in the graph. We write $\sum_{k \ge k(t)} k \langle N_k(t) \rangle = t -
\sum_{k < k(t)} k \langle N_k(t) \rangle$ and use that for $\alpha < 1/4$, $\sum_k P_k = 1$ to write
$\sum_{k \ge k(t)} t P_k + \sum_{k < k(t)} \big( t P_k - k \langle N_k(t) \rangle \big) \simeq k(t)$. For large $k(t)$, the asymptotics of the first
sum is $\sum_{k \ge k(t)} t P_k \sim t\, k(t)^{-\nu + 1}$. The second sum is made of finite size
corrections. If we assume that these are not too large, we conclude that
$t\, k(t)^{-\nu + 1} \simeq k(t)$, i.e. that $k(t) \simeq t^{1/\nu}$.

In fact, experience from finite size scaling suggests that the two sums
give contributions of the same order of magnitude. The idea is that $t P_k$ is
the main contribution to $k \langle N_k(t) \rangle$ not only when $t P_k$ is of order $t$, but even
when simply $t P_k \gg 1$ so that self averaging remains valid. This means that
$t P_k - k \langle N_k(t) \rangle$ and $k \langle N_k(t) \rangle$ become of the same order of magnitude only
when $k$ is so large that $t P_k \sim 1$. So again we see that $k(t)$ is characterized
by $t P_{k(t)} \sim 1$, and we conjecture that $t k^{-\nu}$ is a scaling variable.
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7 Behavior above the transition, a > a c . 



Above the transition there is an giant component. Let be its relative 
size. By definition, d T Y(0) = 1 — or alternatively 

X)P fc + Poo = l, «>l/4 (27) 

This makes clear that the P^s and P^ define the probability distribution of 
vertices among the clusters of different sizes, with the probability for a 
vertex to be in the percolating cluster. 

The size of the giant component $k_\infty$ increases linearly with time: $k_\infty \simeq tP_\infty$
for $t$ large. The slope may be evaluated as follows. Imagine adding
a new vertex at time $t+1$. It is connected to the percolating cluster with
probability $1 - q_{t+1}^{k_\infty(t)}$ so that

$$k_\infty(t+1) \simeq \Big(1 - q_{t+1}^{k_\infty(t)}\Big)\Big(k_\infty(t) + 1 + \sum_p p\,n_p(t)\Big) + q_{t+1}^{k_\infty(t)}\,k_\infty(t)$$

where, as before, $n_p(t)$ are the numbers of components of size $p$ connected
to the new vertex. This can be rearranged as
$k_\infty(t+1) - k_\infty(t) \simeq \big(1 - q_{t+1}^{k_\infty(t)}\big)\big(1 + \sum_p p\,n_p(t)\big)$.
The quantity $P_\infty$ is self averaging, so using $q_t \simeq 1 - \alpha/t$
and $k_\infty \simeq tP_\infty$ at large time we infer
$P_\infty = (1 - e^{-\alpha P_\infty})\big(1 + \langle\sum_p p\,n_p(t)\rangle\big)$ which,
since $\langle\sum_p p\,n_p(t)\rangle \simeq \alpha\mu_1$ at large time, leads to

$$P_\infty = (1 - e^{-\alpha P_\infty})(1 + \alpha\mu_1). \qquad (28)$$

As shown below the above equation is exact, but it does not determine $P_\infty$
as this requires knowing the moment $\mu_1$.
Above the transition, $F(\tau)$ still satisfies the differential equation (17) but
with a different boundary condition:

$$F(0) = \alpha P_\infty, \qquad \alpha > 1/4 \qquad (29)$$

with $P_\infty$ vanishing at the transition i.e. as $\alpha \to 1/4^+$. This modifies the
behavior of its moments. So let us expand $F(\tau)$ in Taylor series around the
origin:

$$F(\tau) = \alpha P_\infty - \tau - \alpha\sum_{m\geq 1} \mu_m \frac{\tau^m}{m!}$$

with $\mu_m = \sum_k k^m P_k$. Eq. (17) implies eq. (28) which is thus exact. The second
order differential equation (20) does not fix $P_\infty$ but determines recursively
the $\mu_m$'s which depend parametrically on $P_\infty$.

It remains to decipher what the behavior of the size of the giant component
is, at least close to the transition. This will follow from an analysis of
the behavior of $F(\tau)$ close to its singularities. As we are going to show, $F(\tau)$
possesses a square root branch cut at a point $\tau_c > 0$ but exponentially close
to the origin. This means that above the transition the singularity of the
function $\partial_z C(z)$ is at $z_c = e^{\tau_c} > 1$, with $z_c$ depending on $\alpha$, a behavior which
has to be compared with the cut at $z = 1$ below the transition, eq. (25). More
precisely, we show that

$$\log P_\infty \simeq -\pi/\beta, \qquad \tau_c \simeq P_\infty/8e \qquad \text{for } \alpha \to 1/4^+ \qquad (30)$$

with $\beta$ defined by

$$4\alpha = \beta^2 + 1, \qquad \beta = \sqrt{4\alpha - 1}.$$
7.1 Preliminaries. 

To prove eq. (30) we shall look at the behavior of $F$ in the neighborhood of
three different points:

- at the origin $\tau = 0$ where $F(\tau)$ takes the boundary value $\alpha P_\infty$,

- at the branch point $\tau_c > 0$ with $F(\tau_c) = 0$, $\partial_\tau F(\tau_c) = \infty$ and,

- around the point $\tau_d < 0$ specified by the condition $F(\tau_d) + 2\tau_d = 0$.

Let us first show $\tau_c$ and $\tau_d$ exist. The function we are interested in is
defined by $F = -\tau + \alpha\big(1 - \sum_{k\geq 1} P_k e^{k\tau}\big)$, and satisfies the differential equation
eq. (17). So $\partial_\tau F < 0$ and $F$ decreases from $+\infty$ at $\tau = -\infty$ to $0$ at a point $\tau_c$
which is non-negative since $F(0) \geq 0$. $F$ also satisfies the obvious inequalities
$0 < F + \tau < \alpha$ for $\tau < 0$ and $\partial_\tau^2 F < 0$. So there is a single point $\tau_d$, with
$-\alpha < \tau_d < 0$, at which $F(\tau_d) + 2\tau_d = 0$ and $F + 2\tau$ has the sign of $\tau - \tau_d$.
Both $\tau_c$ and $\tau_d$ vanish at the transition, $\beta \to 0$.

It turns out that most of these properties could be proved using only the
differential equation. More precisely, take any solution $E$ of (17) with an
initial condition at $\tau_i < 0$ such that $E(\tau_i) + \tau_i > 0$. The function $E$ can be
extended on the right as a positive function on a maximal interval $[\tau_i, \tau_f[$.
Then $E$ is strictly decreasing for $\tau > \tau_i$, $\lim_{\tau\to\tau_f^-} E = 0$, $\tau_f \geq 0$, and $\tau_f > 0$ if
$\alpha > 1/4$. This discussion emphasizes the intrinsic role played by $\alpha = 1/4$ in
this problem.

Before proving eq. (30) let us get intuition from a simpler, but more universal,
version of the differential equation (17). If we look at a region where $F$
is small - and we know that such regions exist generically - we can estimate
$1 - e^{-F}$ by $F \simeq f$ with $f$ approximating $F$. This leads to the simplified, but
still non-linear, differential equation

$$\alpha f \partial_\tau f + \tau + f = 0. \qquad (31)$$

It turns out that this simpler equation can be solved in closed form. Before
doing that, let us further assume that $\tau$ is close to $\tau_c$ so that $f$ can be
neglected compared to $\tau$. Then we get $\alpha f \partial_\tau f \simeq -\tau$, with solution

$$f \simeq \sqrt{(\tau_c^2 - \tau^2)/\alpha},$$

leading to the announced square root singularity. But this does not give
information on the location of the branch point.

Consider now the following functional of an arbitrary function $f$ of $\tau$

$$\frac{1}{2}\log(\alpha f^2 + \tau f + \tau^2) - \frac{1}{\beta}\arctan\left(\frac{\beta f}{f + 2\tau}\right). \qquad (32)$$

To fix conventions, we specify the function arctan by demanding that it is
continuous and takes value in $]-\pi/2, \pi/2[$. The total derivative of this
functional with respect to $\tau$ is

$$\frac{\alpha f \partial_\tau f + \tau + f}{\alpha f^2 + \tau f + \tau^2}.$$

It vanishes if $f$ is a solution of eq. (31), so we have indeed solved in closed
form equation (17) for small $F$. As this is the domain we are interested
in, it is tempting to argue that $F$ and $f$ should exhibit the same singular
behaviour. This turns out to be true, but there are some subtleties because
the limit $\alpha \to 1/4^+$ is singular: the size of the domain for which $f$ is a
uniform approximation to $F$ shrinks to 0.
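
The conservation law can be checked numerically. The sketch below integrates the approximate equation $\alpha f \partial_\tau f + \tau + f = 0$ with a standard fourth-order Runge-Kutta scheme and measures how much the functional $\frac{1}{2}\log(\alpha f^2+\tau f+\tau^2) - \frac{1}{\beta}\arctan(\beta f/(f+2\tau))$ drifts along the solution. The parameter values, the step count and all function names are illustrative assumptions; the integration range is chosen so that $f > 0$ and $f + 2\tau > 0$ throughout, away from the arctan jump.

```python
import math


def invariant(f: float, tau: float, alpha: float) -> float:
    """1/2 log(alpha f^2 + tau f + tau^2) - (1/beta) arctan(beta f / (f + 2 tau))."""
    beta = math.sqrt(4.0 * alpha - 1.0)
    quadratic = alpha * f * f + tau * f + tau * tau
    return 0.5 * math.log(quadratic) - math.atan(beta * f / (f + 2.0 * tau)) / beta


def slope(f: float, tau: float, alpha: float) -> float:
    """df/dtau obtained from alpha f f' + tau + f = 0."""
    return -(tau + f) / (alpha * f)


def integrate(f0: float, tau0: float, tau1: float, alpha: float, steps: int = 1000) -> float:
    """Fourth-order Runge-Kutta integration of the approximate equation."""
    h = (tau1 - tau0) / steps
    f, tau = f0, tau0
    for _ in range(steps):
        k1 = slope(f, tau, alpha)
        k2 = slope(f + 0.5 * h * k1, tau + 0.5 * h, alpha)
        k3 = slope(f + 0.5 * h * k2, tau + 0.5 * h, alpha)
        k4 = slope(f + h * k3, tau + h, alpha)
        f += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        tau += h
    return f


def invariant_drift(alpha: float = 1.0, f0: float = 1.0,
                    tau0: float = 0.0, tau1: float = 0.3) -> float:
    """Change of the functional between the two ends of the integration range."""
    f1 = integrate(f0, tau0, tau1, alpha)
    return invariant(f1, tau1, alpha) - invariant(f0, tau0, alpha)


if __name__ == "__main__":
    print(invariant_drift())                      # expected to be at round-off level
    print(invariant_drift(alpha=0.3, tau1=0.1))   # closer to the transition
```

The drift should be limited by the integration error only, which is what one expects from an exact first integral of the approximate equation.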






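7.2 Rigorous estimates

Our strategy is to use the invariant (32) for the approximate equation (31)
to derive exact inequalities for $F$.

Let us observe that the functional (32) is singular at points where $f + 2\tau = 0$
where it has a jump of amplitude $\pm\pi/\beta$. With this in mind, we define the
functional

$$I(\alpha', F(\tau), \tau) = \frac{1}{2}\log(\alpha' F^2 + \tau F + \tau^2) - \begin{cases} \frac{1}{\beta'}\arctan\left(\frac{\beta' F}{F + 2\tau}\right) + \frac{\pi}{\beta'} & \text{if } \tau < \tau_d < 0 \\ \frac{\pi}{2\beta'} & \text{if } \tau = \tau_d \\ \frac{1}{\beta'}\arctan\left(\frac{\beta' F}{F + 2\tau}\right) & \text{if } \tau_d < \tau < \tau_c \end{cases} \qquad (33)$$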

In this definition, $\alpha' = (1 + \beta'^2)/4 > 1/4$ is a priori independent of $\alpha$, the
value for which $F(\tau)$ is considered. The functional $I(\alpha', F(\tau), \tau)$ is a smooth
function of $\tau$ on $]-\infty, \tau_c[$. Using the differential equation for $F$, its total
derivative with respect to $\tau$ is found to be

$$\frac{dI}{d\tau} = \frac{\tau + F}{\alpha' F^2 + \tau F + \tau^2}\left(1 - \frac{\alpha'}{\alpha}\,\frac{F}{1 - e^{-F}}\right). \qquad (34)$$

If $\alpha' = \alpha$, the right-hand side is always negative, so $I$ is decreasing, and
comparing its values at $\tau < \tau_d$, at $\tau_d$, at $0$ and at $\tau_c$ we find

$$I(\alpha, F(\tau), \tau) \geq \log\big(\beta|\tau_d| e^{-\pi/2\beta}\big) \geq \log\big(\alpha^{3/2} P_\infty\big) - \frac{1}{\beta}\arctan\beta \geq \log\tau_c. \qquad (35)$$

We know that $\tau_d > -\alpha$, so taking a fixed $\tau < -1/4$, we can take the limit
$\alpha \to 1/4^+$. At point $\tau$, $F$ is analytic in $\alpha$, so $I(\alpha, F(\tau), \tau) + \pi/\beta$ has a finite
limit, and we get exponentially small upper bounds for $|\tau_d|$, $P_\infty$ and $\tau_c$. For
instance, $|\tau_d|\,\beta\, e^{\pi/2\beta}$ is bounded above when $\alpha \to 1/4^+$.

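For $\alpha' < \alpha$, $\frac{dI}{d\tau}$ changes sign at a point $\tau'$ solution of $1 - \frac{\alpha'}{\alpha}\frac{F}{1 - e^{-F}} = 0$.
This point is unique as $F$ is decreasing and goes from $+\infty$ at $-\infty$ to $0$ at $\tau_c$.
Let us choose $\alpha'$ such that $\tau' \leq \tau_d$ - and such choices exist. Then $\frac{dI}{d\tau}$ goes
from negative to positive if $\tau$ increases so that $I$ increases for $\tau$ varying from
$\tau'$ to $\tau_c$. This leads to lower bounds

$$I(\alpha', F(\tau'), \tau') \leq \log\big(\beta'|\tau_d| e^{-\pi/2\beta'}\big) \leq \log\big(\alpha'^{1/2}\alpha P_\infty\big) - \frac{1}{\beta'}\arctan\beta' \leq \log\tau_c. \qquad (36)$$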

Take $\alpha'/\alpha = (e^{2\tau_d} - 1)/2\tau_d$, in such a way that $\frac{dI}{d\tau}$ vanishes exactly at $\tau_d$.

Then $\alpha' = \alpha(1 + \tau_d + \cdots)$, and $\beta' = \beta(1 + \tau_d/2\beta^2 + \cdots)$ when $\alpha \to 1/4^+$.

Comparing the lower bounds with the upper bounds obtained above, we see
that

$$\beta|\tau_d| e^{-\pi/2\beta} \simeq P_\infty/8e \simeq \tau_c \qquad \text{for } \alpha \to 1/4^+.$$

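To get a lower bound for $|\tau_d|$ we take $\alpha'$ such that $\frac{dI}{d\tau}$ vanishes at $\tau' < \tau_d$.

Then $-\frac{1}{\beta'}\arctan\left(\frac{\beta' F}{F + 2\tau}\right) > 0$ and $\alpha' F^2 + \tau F + \tau^2 = \beta'^2 F^2/4 + (\tau + F/2)^2 \geq \beta'^2 F^2/4$
so we have a crude bound $I(\alpha', F(\tau'), \tau') + \pi/\beta' \geq \log\big(\beta' F(\tau')/2\big)$.
Take for instance $\beta' = \beta - \beta^2$, so that $\alpha'/\alpha \to 1$ and $F(\tau') \to 0$ as $\beta \to 0^+$.
Substituting in $\frac{dI}{d\tau} = 0$ yields $F \simeq 4\beta^3$. Thus, from (36), $|\tau_d|\,\beta^{-3} e^{\pi/2\beta}$ is bounded
below when $\alpha \to 1/4^+$.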

The above upper and lower bounds then imply

$$2\log|\tau_d| \simeq \log\tau_c \simeq \log P_\infty \simeq -\pi/\beta \qquad \text{as } \beta \to 0^+.$$

We restate the physically most important result: up to an algebraic prefactor,
the size of the percolation cluster close to percolation is

$$P_\infty \propto e^{-\pi/\sqrt{4\alpha - 1}}. \qquad (37)$$

In fact, we have obtained a better estimate. We expect that $\tau_d\,\beta^{1-\gamma} e^{\pi/2\beta}$
has a finite limit for $\beta \to 0^+$ for a certain $\gamma$, or equivalently

$$P_\infty \simeq \text{const}\;\beta^\gamma e^{-\pi/\beta}, \qquad \beta \to 0^+. \qquad (38)$$

We have proved that if $\gamma$ exists, $0 \leq \gamma \leq 4$. The asymptotic behavior of the
probabilities $P_k$ is not the same below and above the transition as $C(z)$ does
not have the same analytical properties on the two sides of the transition.

Since above the transition the branch point is located at $z_c = \exp\tau_c > 1$,
the $P_k$'s now decrease exponentially. More precisely, $\partial_z C(z)$ possesses a
square root branch point at $z_c$,

$$\partial_z C(z) = \text{const.}\,\sqrt{z_c - z} + \cdots$$

so that $P_k = \oint \frac{dz}{2i\pi}\, z^{-k}\,\partial_z C(z)$ behave as

$$P_k \simeq_{k\to\infty} \text{const.}\; k^{-3/2} z_c^{-k} \simeq \text{const.}\; k^{-3/2} e^{-k\tau_c}, \qquad (39)$$

to be compared with eq. (23).

7.3 Universality. 

We give now a universality argument suggesting that $\gamma = 0$ and this is
confirmed by solving numerically the differential equation (17).

Our invocation of universality rests on eq. (39), in which $\tau_c$, or equivalently
(in the vicinity of the critical point) $P_\infty/8e$, controls the exponential
decrease of the $P_k$'s and plays the role of a mass gap. This mass gap is
exponentially small close to the transition, and by analogy we may argue that
increasing $\beta$ is a marginally relevant perturbation of the percolating critical
point. Introducing the Wilson-Callan-Symanzik beta function $B(\beta)$, we
expect a relation

$$\tau_c \simeq P_\infty/8e \simeq \exp\left(-\int_\beta \frac{d\beta'}{B(\beta')}\right). \qquad (40)$$

Comparison with our formula gives $B(\beta) = \beta^2/\pi + B_3\beta^3 + \cdots$. It is known
from field theory that the coefficient $B_3$, which dictates the exponent of the
algebraic prefactor, is universal.

Such universal features are controlled by the continuum limit. In our
framework, the continuum region is reached at small $\tau$, and small $F$, and
we expect that the continuum limit is governed by (31). Note that both $\tau_c$
and $|\tau_d|$ are exponentially small close to the transition, and $F(\tau)$ remains
small for $\tau$ between these points. This a posteriori justifies looking at the
approximate equation (31). Consequently, as far as universal quantities are
concerned, the inequalities derived above can be replaced by equalities. This leads
to $\gamma = 0$, or $B_3 = 0$ i.e.

$$P_\infty \simeq \text{const}\; e^{-\pi/\beta}$$

as announced.

As universality could suggest, eq. (31) turns out to describe the continuum
limit for a larger class of evolving networks than just the specific one we
are studying. This is the case for the model studied in 0. We refer to the 
original paper for the definitions. It suffices to say that eq. (17) is replaced
by

$$2\delta\, S\,\partial_\tau S = -S - (e^\tau - 1).$$

In this equation, $1 - S$ is a generating function, the coefficient of $e^{k\tau}$ giving
the fraction of points in components of finite size $k$, so $S(0)$ is the fraction of
sites occupied by the giant component. The parameter $\delta$ is the average
number of edges created at each time step, so this is precisely the equivalent
of our $\alpha$. To study $S$ close to $\tau = 0$, the approximation is $e^\tau - 1 \simeq \tau$,
and we retrieve (31), with $\alpha$ replaced by $2\delta$. The percolation threshold is
$\delta = 1/8$ (as expected, percolation thresholds are not universal), and the
size of the infinite component behaves like $\exp\big(-\pi/(2\sqrt{2}\,(\delta - 1/8)^{1/2})\big)$. The prefactor
$\pi/2\sqrt{2} \simeq 1.111$ compares quite well with the numerical value $1.132 \pm 0.008$
obtained in the original paper. This fact was already noted in [3].

7.4 Scaling regime, a > a c . 

We have accumulated evidence that the percolation phase transition is very
similar to the Kosterlitz-Thouless phase transition in the XY model. For
instance, the probabilities $P_k$ follow the scaling laws (23) below the transition
with scaling exponents varying continuously with the parameter $\alpha$ while they
decrease exponentially above the transition.

The discussion of universality in the previous section suggests looking
for scaling functions describing the neighborhood of the critical point. For
instance, from eq. (20) it follows that the moments $\mu_m$, $m \geq 2$, diverge as
$P_\infty \to 0$ with

$$\mu_m \simeq_{\alpha\to 1/4^+} \frac{g_m}{P_\infty^{m-1}}, \qquad g_1 = 12,\; g_2 = 64,\; \cdots$$

In particular, $\mu_1$ is discontinuous at the transition, jumping from 4 to 12.
These scaling relations imply that the function

$$G(x) \equiv P_\infty^{-1} F(xP_\infty)$$

has a finite limit, denoted by $G_c(x)$, as $\alpha \to 1/4^+$ for any fixed $x$,

$$G_c(x) = \alpha - x - \alpha\sum_{m\geq 1} g_m \frac{x^m}{m!}. \qquad (41)$$

The differential equation (17) at $\alpha = 1/4^+$ then reduces to $G_c\,\partial_x G_c + 4(G_c + x) = 0$.
With the boundary condition $G_c(0) = 1/4$, this is integrated as:

$$\log\big(4(G_c + 2x)\big) + \frac{2x}{G_c + 2x} = 0. \qquad (42)$$

The left-hand side is very reminiscent of the limit $\alpha \to 1/4$ of eq. (32). As
a consequence $G_c(x)$ has a square root branch point at $x_c = 1/8e$ at which
$G_c$ vanishes and $g_m/m! \simeq m^{-3/2}(8e)^m$ for $m$ large. Actually, a more precise
computation based on the Lagrange formula presented in Appendix C yields
the exact value of the scaling coefficients $g_m$:

$$g_{m+1} = m^m\, 8^{m+1}, \qquad m \geq 1. \qquad (43)$$
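
The consistency between the implicit solution (42) and the coefficients (43) can be probed numerically. The sketch below, written under the assumption that $\alpha = 1/4$ in (41) and with illustrative function names, solves (42) for $G_c(x)$ by bisection and compares the result with the truncated series built from $g_1 = 12$ and $g_{m+1} = m^m 8^{m+1}$. It is only meant for $0 < x < 1/8e$, inside the radius of convergence.

```python
import math


def g_coefficient(m: int) -> int:
    """Scaling coefficients: g_1 = 12 and g_{m+1} = m^m * 8^(m+1) for m >= 1."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    if m == 1:
        return 12
    return (m - 1) ** (m - 1) * 8 ** m


def series_G(x: float, terms: int = 40) -> float:
    """Truncated series (41) with alpha = 1/4 (assumed value at the transition)."""
    alpha = 0.25
    total = sum(g_coefficient(m) / math.factorial(m) * x ** m
                for m in range(1, terms + 1))
    return alpha - x - alpha * total


def implicit_G(x: float, iterations: int = 200) -> float:
    """Solve log(4 (G + 2x)) + 2x / (G + 2x) = 0 for G > 0 by bisection on H = G + 2x."""
    def h(H: float) -> float:
        return math.log(4.0 * H) + 2.0 * x / H

    low, high = 2.0 * x, 0.25 + 2.0 * x  # h(low) < 0 < h(high) when 0 < x < 1/(8e)
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if h(mid) < 0.0:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high) - 2.0 * x


if __name__ == "__main__":
    for x in (0.005, 0.01, 0.02):
        print(x, series_G(x), implicit_G(x))
```

Bisection is used rather than a Newton iteration because the left-hand side of (42) is monotonic in $G_c + 2x$ on the bracketing interval, so no starting guess is needed.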



8 Chronological profiles 

One can make a direct counting of the average number of connected com- 
ponents in the random graph at time t which are copies of a given finite 
labeled graph. This will allow us to retrieve the results of section 0] But this 
also leads to a detailed local-in-time description of the connected components 
which illustrates the consequences of the chronological memory of our model. 

8.1 Tree distributions. 

Let $G$ be a labeled graph with vertices $1, 2, \cdots, k$. We let $m_j$ be the number
of edges connecting vertex $j$ to a vertex with smaller label, so that
$m = m_1 + \cdots + m_k$ is the number of edges of $G$. We look for the average number of
increasing maps $v$ from $[1, \cdots, k]$ to $[1, \cdots, t]$ such that the vertices $v_1, \cdots, v_k$
span a connected component of the random graph isomorphic to $G$. This
number is the average number of connected components isomorphic to $G$ in
the random graph.

By the rules of construction of the random graph, the probability that
the vertices $v_1, \cdots, v_k$ with $1 \leq v_1 < \cdots < v_k \leq t$ span a connected
component of the random graph isomorphic to $G$ is

$$\prod_{i=1}^{k}\left(p_{v_i}^{m_i}\, q_{v_i}^{v_i - 1 - m_i} \prod_{v_i < w_i < v_{i+1}} q_{w_i}^{i}\right)$$

with the convention $v_{k+1} = t + 1$.

The average we look for is obtained by summing this expression over the
$v_i$'s. For large $t$, using the asymptotic behavior $p_j \simeq \alpha/j$ for large $j$, the sum
can be formally reinterpreted as a Riemann sum, leading to a contribution

$$t^{k-m}\, e^{-k\alpha}\, \alpha^m \int_{0<\sigma_1<\cdots<\sigma_k<1} d\sigma_1\, \sigma_1^{\alpha - m_1} \cdots d\sigma_k\, \sigma_k^{\alpha - m_k}.$$

In deriving this formula, we have not treated carefully the contribution of
small values of the $v_i$'s. This is reflected in the fact that the integral can be
divergent if some $m_i$'s are too large. However, the prefactor $t^{k-m}$ is the sign
that in this case the sum over the $v_i$'s is nevertheless negligible in the large
$t$ limit, as a more careful treatment would show.

The most salient feature of this formula is that only connected graphs
with $k = m + 1$ give a contribution proportional to $t$, i.e. contribute to $C_k$.
Since $k$ is the number of vertices and $m$ the number of links, this relation
characterizes trees as follows from the Euler formula: only trees contribute to
the thermodynamic limit. A given tree on $k$ vertices with incoming degrees
$m_i$, $i = 1, \cdots, k$ gives a contribution

$$\frac{e^{-k\alpha}\,\alpha^{k-1}}{(\alpha + 1 - m_1)(2\alpha + 2 - m_1 - m_2)\cdots(k\alpha + k - m_1 - \cdots - m_k)} \qquad (44)$$

to $C_k$. Observe that for a tree, $m_1 + \cdots + m_i \leq i - 1$ for $i = 1, \cdots, k$ so that
all integrals are well-defined and finite for real non-negative $\alpha$. It is amusing
to note also that the contribution of a single tree of size $k$ can contain poles
in $\alpha$ at values that are not in the list $-1, -1/2, \cdots, -1/k$. These poles have
to cancel between different trees in the sum over trees of size $k$, because we
know that they are absent in $C_k$. But we have no simple explanation for this
cancellation.
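
For small sizes the sum over trees can be carried out by brute force. The following sketch enumerates all labeled trees on $k$ vertices, adds up the contributions (44) with exact rational arithmetic and returns $e^{k\alpha} C_k$ (the common factor $e^{-k\alpha}$ is left out so that everything stays rational). The enumeration strategy and the function names are illustrative choices, not the method used in the text, and the cost grows quickly with $k$.

```python
from fractions import Fraction
from itertools import combinations


def labeled_trees(k):
    """Yield every labeled tree on vertices 1..k as a tuple of edges (i, j) with i < j."""
    vertices = list(range(1, k + 1))
    possible_edges = list(combinations(vertices, 2))
    for edges in combinations(possible_edges, k - 1):
        parent = {v: v for v in vertices}  # union-find forest

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        is_forest = True
        for i, j in edges:
            root_i, root_j = find(i), find(j)
            if root_i == root_j:  # this edge would close a cycle
                is_forest = False
                break
            parent[root_i] = root_j
        if is_forest:  # k - 1 edges and no cycle on k vertices: a spanning tree
            yield edges


def tree_weight(edges, k, alpha):
    """Contribution (44) of one tree, without the common factor exp(-k * alpha)."""
    incoming = [0] * (k + 1)  # incoming[j] = m_j, edges to vertices with smaller label
    for _, younger in edges:
        incoming[younger] += 1
    weight = Fraction(alpha) ** (k - 1)
    used = 0
    for i in range(1, k + 1):
        used += incoming[i]
        weight /= i * alpha + i - used
    return weight


def reduced_C(k, alpha):
    """Return exp(k * alpha) * C_k as the sum of (44) over all trees of size k."""
    return sum(tree_weight(edges, k, alpha) for edges in labeled_trees(k))


if __name__ == "__main__":
    a = Fraction(1, 3)
    for k in range(1, 5):
        print(k, reduced_C(k, a))
```

For $k = 1, 2$ this reproduces $e^{\alpha}C_1 = 1/(\alpha+1)$ and $e^{2\alpha}C_2 = \alpha/((\alpha+1)(2\alpha+1))$, the values implied by the isolated-vertex and two-vertex profiles discussed below.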

This explicit formula makes it easy to show that $C(z)$ is (complex) analytic
in $\alpha$ in a neighborhood of $]0, +\infty[$ for every $z$ such that $|z| < 1$. Indeed,
we know that for $\alpha \in\, ]0, +\infty[$ and $|z| < 1$ the series $\sum_k P_k(\alpha) z^k$ is absolutely
convergent. But $P_k(\alpha)$ is a sum of non-negative contributions, each tree giving
a contribution of the form (44). Now suppose that $\alpha$ is complex with
positive real part. For each tree contribution and for fixed $\Re\alpha$, the modulus
of

$$\frac{e^{-k\alpha}}{(\alpha + 1 - m_1)(2\alpha + 2 - m_1 - m_2)\cdots(k\alpha + k - m_1 - \cdots - m_k)}$$

is maximal when $\Im\alpha = 0$ and then the expression is real and positive. This is
because the statement is true for every factor. Taking the sum over trees we
infer that $|P_k(\alpha)| \leq P_k(\Re\alpha)\left(\frac{|\alpha|}{\Re\alpha}\right)^{k-1}$. So the series $\sum_k P_k(\alpha) z^k$ is absolutely
convergent if $|z\alpha| < \Re\alpha$. This equation defines, for fixed $|z| < 1$, a neighborhood
of $\alpha \in\, ]0, +\infty[$ in which we have an absolutely convergent sum of
analytic functions of $\alpha$. Hence the sum is analytic in $\alpha$ as claimed.

To resum more explicitly the contribution of all trees of a given size, we
need the generating function for labeled trees with given incoming degrees.
Suppose more generally that we give a weight $x_i$ for each edge leaving vertex
$i$ (i.e. connecting $i$ to a $j > i$) and a weight $y_i$ for each edge entering vertex
$i$ (i.e. connecting $i$ to a $j < i$). The generating function $T$ for weighted trees
on $n$ vertices factorizes nicely as

$$T = x_1\,(y_2x_1 + y_2x_2 + y_3x_2 + \cdots + y_nx_2)\,(y_3x_1 + y_3x_2 + y_3x_3 + y_4x_3 + \cdots + y_nx_3)\cdots(y_{n-1}x_1 + \cdots + y_{n-1}x_{n-1} + y_nx_{n-1})\,y_n.$$

This generalization of the famous Cayley tree formula 2 implies it immediately.
It seems to be little known, although it is implicit in the mathematical litera- 
ture El- Gilles Schaeffer [TO] provided us with a clean proof using a refined 
version of one of the standard proofs of the Cayley tree formula, putting trees
on $k$ vertices in one to one correspondence with applications from $[1, \cdots, k]$
to $[1, \cdots, k]$ fixing $1$ and $k$, see e.g. [11].
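
The factorization can be tested for small $n$ by comparing it with a direct sum over trees. The sketch below does so with random integer weights; the brute-force enumeration, the 1-based list convention and all names are illustrative assumptions. Each edge $(i, j)$ with $i < j$ carries the weight $x_i y_j$, and the factor attached to vertex $i$ is read as $y_i(x_1 + \cdots + x_i) + x_i(y_{i+1} + \cdots + y_n)$.

```python
import random
from itertools import combinations


def labeled_trees(n):
    """Yield every labeled tree on vertices 1..n as a tuple of edges (i, j) with i < j."""
    vertices = list(range(1, n + 1))
    possible_edges = list(combinations(vertices, 2))
    for edges in combinations(possible_edges, n - 1):
        parent = {v: v for v in vertices}

        def find(v):
            while parent[v] != v:
                v = parent[v]
            return v

        is_forest = True
        for i, j in edges:
            root_i, root_j = find(i), find(j)
            if root_i == root_j:  # cycle detected
                is_forest = False
                break
            parent[root_i] = root_j
        if is_forest:
            yield edges


def brute_force_T(x, y):
    """Sum over all trees of the product over edges (i < j) of x[i] * y[j]; lists are 1-based."""
    n = len(x) - 1
    total = 0
    for edges in labeled_trees(n):
        weight = 1
        for i, j in edges:
            weight *= x[i] * y[j]
        total += weight
    return total


def factorized_T(x, y):
    """Product form: x_1 * prod_{i=2}^{n-1} [y_i (x_1+...+x_i) + x_i (y_{i+1}+...+y_n)] * y_n."""
    n = len(x) - 1
    result = x[1] * y[n]
    for i in range(2, n):
        result *= y[i] * sum(x[1:i + 1]) + x[i] * sum(y[i + 1:n + 1])
    return result


def random_check(n, seed=0):
    """Return (brute force, factorized) values of T for random integer weights."""
    rng = random.Random(seed)
    x = [0] + [rng.randint(1, 9) for _ in range(n)]  # index 0 is a placeholder
    y = [0] + [rng.randint(1, 9) for _ in range(n)]
    return brute_force_T(x, y), factorized_T(x, y)


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, random_check(n, seed=n))
```

Setting all weights to 1 makes every factor equal to $n$, which gives back the $n^{n-2}$ count of labeled trees.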

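This formula can be specialized to $x_i = 1$ and $y_i = 1/\sigma_i$ for $i = 1, \cdots, k$
to give

$$C_k = e^{-k\alpha}\alpha^{k-1} \int_{0<\sigma_1<\cdots<\sigma_k<1} d\sigma_1 \cdots d\sigma_k\, (\sigma_1\cdots\sigma_k)^\alpha \left(\frac{2}{\sigma_2} + \frac{1}{\sigma_3} + \cdots + \frac{1}{\sigma_k}\right)\left(\frac{3}{\sigma_3} + \frac{1}{\sigma_4} + \cdots + \frac{1}{\sigma_k}\right)\cdots\left(\frac{k-1}{\sigma_{k-1}} + \frac{1}{\sigma_k}\right)\frac{1}{\sigma_k}. \qquad (45)$$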

If one integrates only over a subset of the $\sigma$'s, one gets marginal distributions.
For instance, for $k = 1$, we get that in the thermodynamic limit the
fraction of sites with age close to $t\sigma$ that are isolated is $e^{-\alpha}\sigma^\alpha$. For $k = 2$,
if we integrate over $\sigma_2$, we get that the fraction of sites with age close to $t\sigma$
that are the older vertex of a tree on two vertices is $e^{-2\alpha}(\sigma^\alpha - \sigma^{2\alpha})$ while
if we integrate over $\sigma_1$, we get that the fraction of sites with age close to
$t\sigma$ that are the younger vertex of a tree on two vertices is $e^{-2\alpha}\sigma^{2\alpha}\frac{\alpha}{\alpha+1}$. The
sum, $e^{-2\alpha}\big(\sigma^\alpha - \frac{\sigma^{2\alpha}}{\alpha+1}\big)$, gives the probability that a site with age close to
$t\sigma$ belongs to a tree on 2 vertices.

Our explicit representation in terms of trees shows that in this model, and
at least for questions concerning connected components, the thermodynamic
limit applies not only to the full system, but also to slices of fixed relative
age $\sigma = t'/t$. In the next section we shall study $\sigma$-dependent profiles.

2 Which states that there are $n^{n-2}$ labeled trees on $n$ vertices.

For small $k$'s, we have done all the integrals and checked the agreement
with the value of $C_k$ obtained by the recursion relation. But a general proof
valid for all $k$'s is lacking.



8.2 Dating finite components. 

To illustrate the evolving nature of our model, we now determine the local-in-time
distribution of the cluster sizes. This means determining for any given
age interval what is the proportion of vertices of these ages which belong to
clusters of given size.

Define $p_k(t, t')$ to be the probability that vertex $t'$ belongs to a component
of size $k$ at time $t$. Guided by the previous tree representation, we infer that
in the thermodynamical limit $\sum_{t'\in[t\sigma,\, t(\sigma+d\sigma)]} p_k(t, t') \simeq t\rho_k(\sigma)d\sigma$, with $\rho_k(\sigma)$ a
deterministic function. By construction, $\int_0^1 \rho_k(\sigma)d\sigma = kC_k$, the total fraction
of points that belong to components of size $k$.

The reasoning leading to (11) can be generalized: one writes down a
recursion relation for $p_k(t+1, t')$ and then takes the average, a step justified
by the explicit tree representation. The event that vertex $t'$ belongs to a
component of size $k$ at time $t+1$ is the exclusive union of several events.

i) Vertex $t'$ belonged to a component of size $k$ at time $t$ and this component
is not linked to the new vertex $t+1$. This has probability $p_k(t, t')\,q_{t+1}^k$.

ii) Vertex $t'$ belonged to a component of size $l < k$ at time $t$, this component
is linked to the new vertex $t+1$, and together with the other components
linked to $t+1$ (say $n_m(t)$ components of size $m$), it builds a component of
size $k = l + 1 + \sum_m m\,n_m(t)$. This has probability

MM'Xi - 4i) n ( Nmit) M 9 5f-w-^-^w) ( i - qT+1 )^) 

To perform the explicit sum over $l$ and the $n_m(t)$'s, we introduce again generating
functions and set $p_{t,t'}(z) = \sum_l p_l(t, t')\,z^l$. This leads to

Pt+i,t'{z) = p t ,t'{zqt+i) 

\ I Ht+l T Z \ L — Ht+l! / m 

In the large $t$ limit this complicated formula simplifies if we use again the
hypothesis of self-averaging and asymptotic independence. Defining

$$\rho(\sigma, z) = \sum_k \rho_k(\sigma)\,z^k, \qquad (46)$$

this leads to

$$\sigma\partial_\sigma\rho = \alpha\left(1 - z\,e^{-\alpha + \alpha z\partial_z C}\right) z\partial_z\rho.$$

Together with the sum rule relating $\rho$ to $C$, this fixes completely the
profiles $\rho_k(\sigma)$. A relation between $\rho(1, z)$ and $C(z)$ is obtained by integrating
the above evolution equation for $\sigma$ between 0 and 1. Using the defining equation
for $C(z)$, eq. (11), this leads to $\rho(1, z) = (1 + \alpha z\partial_z)C(z)$. Thus, we can
summarize our knowledge on component time profiles with the relations:

$$\sigma\partial_\sigma\rho(\sigma, z) = \alpha\big(1 - \rho(1, z)\big)\, z\partial_z\rho(\sigma, z)$$

$$\int_0^1 d\sigma\,\rho(\sigma, z) = z\partial_z C(z) \qquad (47)$$

$$\rho(1, z) = (1 + \alpha z\partial_z)C(z).$$

Expansion in powers of $z$ leads to recursion relations for the $\rho_k(\sigma)$'s.
They can be shown to be alternating polynomials of degree $k$ in $x = \sigma^\alpha$ with
vanishing constant coefficient. The first few polynomials are

$$\rho_1 = e^{-\alpha}\,x$$

$$\rho_2 = e^{-2\alpha}\left(x - \frac{x^2}{\alpha + 1}\right)$$

$$\rho_3 = e^{-3\alpha}\left(\frac{3\alpha + 2}{2(\alpha + 1)}\,x - \frac{2}{\alpha + 1}\,x^2 + \frac{3\alpha + 2}{2(\alpha + 1)^2(2\alpha + 1)}\,x^3\right)$$

Again, we have checked that for small $k$'s the values of $\rho_k$ as computed
from iterated tree integrals and from the generating function coincide, but
we have no general proof.
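
The first three profiles can be checked against the two constraints $\int_0^1 d\sigma\,\rho_k(\sigma) = kC_k$ and $\rho_k(1) = (1 + k\alpha)C_k$. The sketch below does this with exact rational arithmetic, working with $e^{k\alpha}\rho_k$ and $e^{k\alpha}C_k$ so that the exponential prefactors drop out. The closed forms used for $C_1$, $C_2$, $C_3$ are our own evaluation of the tree formula (44) for $k \le 3$ and should be regarded as an assumption of the example; function names are illustrative.

```python
from fractions import Fraction


def profile_coefficients(k, alpha):
    """Coefficients {n: c_n} of exp(k*alpha) * rho_k as a polynomial in x = sigma**alpha."""
    a = Fraction(alpha)
    if k == 1:
        return {1: Fraction(1)}
    if k == 2:
        return {1: Fraction(1), 2: -1 / (a + 1)}
    if k == 3:
        return {
            1: (3 * a + 2) / (2 * (a + 1)),
            2: -2 / (a + 1),
            3: (3 * a + 2) / (2 * (a + 1) ** 2 * (2 * a + 1)),
        }
    raise ValueError("only k = 1, 2, 3 are tabulated")


def reduced_C(k, alpha):
    """exp(k*alpha) * C_k for k <= 3, from the tree formula (44) (assumed closed forms)."""
    a = Fraction(alpha)
    if k == 1:
        return 1 / (a + 1)
    if k == 2:
        return a / ((a + 1) * (2 * a + 1))
    if k == 3:
        return a ** 2 * (6 * a + 5) / (2 * (a + 1) ** 2 * (2 * a + 1) * (3 * a + 1))
    raise ValueError("only k = 1, 2, 3 are tabulated")


def integral_over_ages(coefficients, alpha):
    """Integrate sum_n c_n sigma**(n*alpha) over [0, 1]; each term gives c_n / (n*alpha + 1)."""
    a = Fraction(alpha)
    return sum(c / (n * a + 1) for n, c in coefficients.items())


def value_at_one(coefficients):
    """Value of the polynomial at sigma = 1, i.e. x = 1."""
    return sum(coefficients.values())


if __name__ == "__main__":
    a = Fraction(1, 2)
    for k in (1, 2, 3):
        coeffs = profile_coefficients(k, a)
        print(k, integral_over_ages(coeffs, a), k * reduced_C(k, a),
              value_at_one(coeffs), (1 + k * a) * reduced_C(k, a))
```

Both constraints are polynomial identities in $\alpha$, so agreement at a few rational values of $\alpha$ is a meaningful, though of course not exhaustive, check.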

As an application, let us look at the profile of vertices of relative age close
to $\sigma$ which are the youngest in components of size $k$. These are vertices that
created, when they appeared, a component of size $k$ - this happens with
probability $\rho_k(1) = (1 + k\alpha)C_k$ - which was then left untouched for the rest
of the evolution - this happens with probability $\sigma^{k\alpha}$. So the distribution of
vertices that are the youngest in their connected (finite) component is

$$\sum_k (1 + k\alpha)\,C_k\,\sigma^{k\alpha} = (1 + \alpha z\partial_z)C(\sigma^\alpha). \qquad (48)$$

In this expression, the formal parameter $z$ has been replaced by $\sigma^\alpha$.

The giant component, when it exists, also has a well-defined profile to
which we turn now.

8.3 Dating the percolating cluster.

We now determine the profile of the giant component.

So let $\rho_\infty(\sigma)\,t\,d\sigma$ be the number of vertices whose ages are between $\sigma t$ and
$(\sigma + d\sigma)t$ which belong to the giant component. By definition $\int_0^1 d\sigma\,\rho_\infty(\sigma) = P_\infty$.
We know from the tree representation that for finite clusters the thermodynamic
limit applies not only on the full graph, but also on time slices.
The giant component is the complement of finite components, so we expect
that the density $\rho_\infty(\sigma)$ is self-averaging as is the size of the percolating cluster.
To derive an equation fixing this density we look for the probability for
a site of age $j = \sigma t$ not to be in the percolating cluster. On the one hand,
by definition of the density this probability is

$$\mathrm{Prob}(j = \sigma t \notin [k_\infty]) = 1 - \rho_\infty(\sigma).$$

On the other hand, this probability may be evaluated by demanding the vertex
$j$ not to be connected to the older and younger vertices of the percolating
cluster:

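$$\mathrm{Prob}(j = \sigma t \notin [k_\infty]) = \prod_{k<j,\; k\in[k_\infty]}\left(1 - \frac{\alpha}{j}\right) \prod_{k>j,\; k\in[k_\infty]}\left(1 - \frac{\alpha}{k}\right)$$

At large time, the first above product converges to $\exp\big(-\alpha\int_0^\sigma d\zeta\,\frac{\rho_\infty(\zeta)}{\sigma}\big)$ and
the second to $\exp\big(-\alpha\int_\sigma^1 d\zeta\,\frac{\rho_\infty(\zeta)}{\zeta}\big)$. So we get the relation:

$$1 - \rho_\infty(\sigma) = \exp\left(-\alpha\int_0^\sigma d\zeta\,\frac{\rho_\infty(\zeta)}{\sigma} - \alpha\int_\sigma^1 d\zeta\,\frac{\rho_\infty(\zeta)}{\zeta}\right) = \exp\left(-\alpha\int_0^1 d\zeta\,\rho_\infty(\zeta)\min\Big(\frac{1}{\sigma}, \frac{1}{\zeta}\Big)\right) \qquad (49)$$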

As $\rho_\infty$ is positive, $\int_0^1 d\zeta\,\rho_\infty(\zeta)\min(\frac{1}{\sigma}, \frac{1}{\zeta})$ is a decreasing function of $\sigma$. So
$\rho_\infty(\sigma)$ is decreasing and has a right limit at 0. Unless $\rho_\infty = 0$ (i.e. $\alpha < 1/4$),
the integral has a logarithmic divergence as $\sigma \to 0$, and hence $\rho_\infty(0) = 1$. More
precisely,

$$\rho_\infty(\sigma) = 1 - \sigma^\alpha \exp\left(-\alpha + \alpha\int_0^1 d\zeta\,\frac{1 - \rho_\infty(\zeta)}{\zeta}\right) + \cdots \qquad (50)$$

This means that the early vertices belong to the giant component with probability 1.

On the other hand, by definition $\int_0^1 d\zeta\,\rho_\infty(\zeta) = P_\infty$ so taking $\sigma = 1$ in
(49) we get that

$$\rho_\infty(1) = 1 - e^{-\alpha P_\infty}. \qquad (51)$$

This means that the late vertices always belong to the giant component with a
non vanishing probability - although this probability is exponentially small
close to the threshold. Hence the giant component invades all time slices
above the threshold, and the term percolation transition is appropriate even
with this unusual interpretation. Our results are illustrated on Fig. 2.


Figure 2: The analytic result (solid lines) for the profile of the giant compo- 
nent compared to numerical simulations ( gray clouds ) on 2000 random graphs 
of size 10000. From top to bottom, the values of a are 1, 1/2 and 1/3. 
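
A comparison of this kind can be reproduced on a small scale. The sketch below grows a single graph, assuming that each older vertex is linked to the vertex created at time $t$ independently with probability $\min(1, \alpha/t)$ (only the large-$t$ behavior $\alpha/t$ is taken from the text; the cutoff is our choice), and measures the fraction of each age slice that belongs to the largest component. Graph size, number of slices, seed and function name are all illustrative; a single realization is noisy, so this is not a substitute for the averaged data of the figure.

```python
import random


def simulate_age_profile(alpha, size, slices=10, seed=0):
    """Grow one graph; return, per age slice, the fraction of vertices in the largest component."""
    rng = random.Random(seed)
    parent = list(range(size + 1))  # union-find forest; index 0 is unused

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for new in range(2, size + 1):
        p = min(1.0, alpha / new)  # assumed attachment probability at time `new`
        for old in range(1, new):
            if rng.random() < p:
                parent[find(old)] = find(new)

    component_sizes = {}
    for v in range(1, size + 1):
        root = find(v)
        component_sizes[root] = component_sizes.get(root, 0) + 1
    giant = max(component_sizes, key=component_sizes.get)

    in_giant = [0] * slices
    totals = [0] * slices
    for v in range(1, size + 1):
        s = (v - 1) * slices // size  # slice 0 holds the oldest vertices
        totals[s] += 1
        if find(v) == giant:
            in_giant[s] += 1
    return [count / total for count, total in zip(in_giant, totals)]


if __name__ == "__main__":
    for fraction in simulate_age_profile(alpha=1.0, size=1500, seed=1):
        print(f"{fraction:.3f}")
```

Edge orientation plays no role for connected components, which is why the sketch only merges vertices and does not store the edges.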

To conclude this discussion, let us observe that $\rho_\infty(\sigma)$ can be expressed in
terms of the profiles of finite components that we studied before:

$$\rho_\infty(\sigma) = 1 - \sum_k \rho_k(\sigma) = 1 - \rho(\sigma, z = 1)$$

with $\rho(\sigma, z)$ defined in eq. (46). We end up with yet another equation
constraining $\rho(\sigma, z)$,

$$\rho(\sigma, z = 1) = \sigma^\alpha \exp\left(-\alpha + \alpha\int_0^1 d\zeta\,\big(1 - \rho_\infty(\zeta)\big)\min\Big(\frac{1}{\sigma}, \frac{1}{\zeta}\Big)\right).$$

If we expand $\rho(\sigma, z = 1)$ in powers of $\sigma^\alpha$ as $\rho(\sigma, z = 1) = \sum_{n\geq 1} D_n(1 + n\alpha)\sigma^{n\alpha}$
and define $D(z) = \sum_{n\geq 1} D_n z^n/n$, we can check that

$$z\partial_z D(z) + \alpha(z\partial_z)^2 D(z) = z\exp\big(-\alpha - D(z) + D(1) + \alpha z\partial_z D(1)\big). \qquad (52)$$

In particular, by differentiation this implies that

$$(n - 1)(1 + n\alpha)\,D_n = -\sum_{k+l=n} (1 + \alpha l)\,D_k D_l. \qquad (53)$$

The resemblance with equations (11) and (14) for the function $C(z)$ is rather
striking, but we have not been able to use this more deeply. One difference
is that (11) determines $C_1$ from scratch, whereas (52) does it in a two step
process. If $d_n$ is the solution of this equation satisfying $d_1 = 1$, then
$D_n = d_n D_1^n$, and then $D_1$ has to satisfy



Another difference is that the sequence $D_n$ alternates in sign. Again, the
formal parameter $z$ receives a simple physical interpretation $z = \sigma^\alpha$ where $\sigma$
labels the relative date of birth of vertices.

9 Conclusions 

In this study, we have described detailed global and local-in-time features of 
evolving random graphs with uniform attachment rules. 

Concerning global properties, we have shown that the model has a percolation
phase transition at $\alpha = 1/4$. Below the transition, the system contains
clusters whose sizes scale like $t^{(1-\sqrt{1-4\alpha})/2}$. Above the transition, a single
component, the giant component, grows steadily with time. We have shown that,
close to the threshold, the fraction of sites in the giant component has an
essential singularity and behaves as $e^{-\pi/\sqrt{4\alpha-1}}$. The behaviors below and
above the transition are strongly reminiscent of the two dimensional XY
model, so that our model can be interpreted as some algorithmic equivalent
of it, but it seems unlikely that a direct connexion exists. This analogy and
further scaling properties we present call for an alternative renormalization
group approach to the transition.

By describing local-in-time profiles, we have shown that they offer a more
accurate vision of the specificities of evolving graphs. It would be desirable to
generalize this approach by answering the following question: assume that
a procedure to assign ages to the vertices of some evolving graphs has been
given, what information on the microscopic evolution rules of these graphs
can be decoded from the knowledge of local-in-time statistics?

Acknowledgements: We thank Gilles Schaeffer for his clarifying help in tree
generating functions and Sergei Dorogovtsev for his comments and interest.

10 Appendix A.

We show how our arguments can be modified to describe the
Erdős-Rényi random graph model. In this model, one starts with $n$ points,
and any two points are connected by an edge with probability $\alpha/n$ (so in
this model all the points are equivalent). Then a limit $n \to \infty$ is taken.
This famous model describes a static graph, but it can also be rephrased
as an evolving graph in the following way: set $t = n$ and suppose that
points are added one by one, from 1 to $t$, each new point connecting to any
previous one with probability $\alpha/t$. From this point of view, it can be seen
that looking only at the first $t'$ vertices ($t' < t$) amounts to looking at an
Erdős-Rényi random graph with a modified connectivity parameter $\alpha' = \alpha t'/t$. To
get a recursion relation, we start from an Erdős-Rényi random graph of size
$t$ with connectivity parameter $\alpha$. We add vertex $t+1$ and connect any older
vertex to it with probability $q_{t+1} = \alpha/t$, so that the effective connectivity
parameter for the graph on $t+1$ vertices is $\alpha(t+1)/t$. Then the derivation
proceeds as before, with the little proviso that in eq. (10), on the left-hand
side $\langle N_{t+1}(z)\rangle$ has an effective connectivity parameter $\alpha(t+1)/t$ instead of
$\alpha$. So when we take the thermodynamic limit, nothing changes on the right-hand
side of (10), but an additional term $\alpha\partial_\alpha$ contributes to the left-hand
side. If $c$ denotes the analog of $C$ but for the Erdős-Rényi random graph,
then

$$\alpha\partial_\alpha c = -c - \alpha z\partial_z c + z\exp(-\alpha + \alpha z\partial_z c).$$

This little modification in the equation has drastic consequences. The
new equation has a single solution regular at $\alpha = 0$, namely

$$c(z) = \sum_{k\geq 1} \frac{k^{k-2}}{k!}\,\alpha^{k-1} e^{-k\alpha} z^k$$

which is the well-known result. We conclude that our self-averaging hypothesis
is valid for the Erdős-Rényi model. This makes it more plausible that
it works for our original model as well, a fact also confirmed by numerical
simulations.
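
That the classical tree-counting series solves the modified equation can be verified coefficient by coefficient. The sketch below assumes the coefficients $c_k = \frac{k^{k-2}}{k!}\alpha^{k-1}e^{-k\alpha}$, approximates $\partial_\alpha c_k$ by a central finite difference, and expands $\exp(\alpha z\partial_z c)$ as a power series in $z$ with the standard recurrence for the exponential of a series. Truncation order, step size and names are illustrative.

```python
import math


def er_coefficient(k: int, alpha: float) -> float:
    """Coefficient of z^k: k^(k-2) / k! * alpha^(k-1) * exp(-k * alpha)."""
    return k ** (k - 2) / math.factorial(k) * alpha ** (k - 1) * math.exp(-k * alpha)


def exp_series(a, order):
    """Coefficients e_0..e_order of exp(sum_{j>=1} a[j] z^j), using n e_n = sum_j j a_j e_{n-j}."""
    e = [1.0] + [0.0] * order
    for n in range(1, order + 1):
        e[n] = sum(j * a[j] * e[n - j] for j in range(1, n + 1)) / n
    return e


def residuals(alpha: float, order: int = 8, h: float = 1e-5):
    """Coefficients of alpha d_alpha c + c + alpha z d_z c - z exp(-alpha + alpha z d_z c)."""
    # a[k] is the coefficient of z^k in alpha * z d_z c
    a = [0.0] + [alpha * k * er_coefficient(k, alpha) for k in range(1, order + 1)]
    e = exp_series(a, order)
    out = []
    for k in range(1, order + 1):
        c = er_coefficient(k, alpha)
        dc = (er_coefficient(k, alpha + h) - er_coefficient(k, alpha - h)) / (2.0 * h)
        lhs = alpha * dc + c + alpha * k * c
        rhs = math.exp(-alpha) * e[k - 1]  # coefficient of z^k in z * exp(-alpha + ...)
        out.append(lhs - rhs)
    return out


if __name__ == "__main__":
    print(max(abs(r) for r in residuals(0.7)))
    print(max(abs(r) for r in residuals(2.0)))
```

The residuals should vanish up to the finite-difference error, both below and above the Erdős-Rényi threshold $\alpha = 1$, since the series is a formal solution for every $\alpha$.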

11 Appendix B. 

Our goal is to prove the following formula and to investigate a few of its
consequences. We claim that

$$\Big\langle \prod_{k\geq 1} w_k^{N_k(t+1)} \Big\rangle = \sum_{m\geq 1} w_m \oint \frac{d\xi}{2i\pi\,\xi^m}\, \Big\langle \prod_{j\geq 1} \big(w_j q_{t+1}^j + \xi^j(1 - q_{t+1}^j)\big)^{N_j(t)} \Big\rangle \qquad (54)$$

where the contour integral is around the origin. It is a Fokker-Planck equation
for the Markov process formed by the $N_k(t)$'s, which could be used
to prove systematically that the variables $C_k$ are self-averaging, a task we
perform for $C_1$ and $C_2$ at the end of this appendix.
We start from the recursion relation for the $N_k(t)$'s, which may be rewritten as

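$$\Big\langle \prod_k w_k^{N_k(t+1)} \Big\rangle = \Big\langle \prod_j w_j^{N_j(t) - n_j(t)}\; w_{1 + \sum_p p\,n_p(t)} \Big\rangle$$

To compute the last term we insert the tautological identity

$$1 = \sum_{k\geq 1} \delta_{k,\,1 + \sum_p p\,n_p(t)}$$

in the r.h.s. to get

$$\Big\langle \prod_k w_k^{N_k(t+1)} \Big\rangle = \sum_{k\geq 1} w_k \Big\langle \prod_j w_j^{N_j(t) - n_j(t)}\; \delta_{k,\,1 + \sum_p p\,n_p(t)} \Big\rangle$$

To compute the r.h.s. expectation value we use a contour integral representation
of the Kronecker symbol

$$\delta_{k,\,1 + \sum_p p\,n_p(t)} = \oint \frac{d\xi}{2i\pi\,\xi^k}\; \xi^{\sum_p p\,n_p(t)}$$

This yields

$$\Big\langle \prod_k w_k^{N_k(t+1)} \Big\rangle = \sum_{k\geq 1} w_k \oint \frac{d\xi}{2i\pi\,\xi^k}\, \Big\langle \prod_j w_j^{N_j(t) - n_j(t)}\, \xi^{j\,n_j(t)} \Big\rangle$$

The r.h.s. can now be computed using the distribution of the $n_j(t)$'s and gives eq. (54).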

This computation has a rather simple combinatorial reinterpretation. We
set

$$ F_t(w_1, w_2, \cdots) = \Big\langle \prod_k w_k^{N_k(t)} \Big\rangle $$

and observe that going from time $t$ to $t+1$, we add vertex $t+1$ and edges
from the rest of the graph to $t+1$. Suppose that the component of vertex
$t+1$ has size, say, $k$. This component was built by "eating" some components
of the graph at time $t$. A component of size $j$ is swallowed with probability
$1 - q_{t+1}^{j}$ and survives with probability $q_{t+1}^{j}$. On the other hand, if one expands

$$ F_t\big( w_1 q_{t+1} + \xi (1 - q_{t+1}),\; w_2 q_{t+1}^{2} + \xi^{2} (1 - q_{t+1}^{2}),\; \cdots \big) $$

in powers of $\xi$, the term of degree $k-1$ enumerates all the possibilities to
build the component of vertex $t+1$ with the correct probability. Defining
$w_i(\xi) = w_i\, q_{t+1}^{i} + \xi^{i} (1 - q_{t+1}^{i})$, summation over $k$ gives

$$ F_{t+1}(w_1, w_2, \cdots) = \sum_{k\ge 1} w_k \oint \frac{d\xi}{2i\pi}\, \xi^{-k}\, F_t\big( w_1(\xi), w_2(\xi), \cdots \big), $$

as obtained previously. 
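
The swallowing rule can be turned into an exact evolution of the law of the component sizes for small $t$. The sketch below is only an illustration: it assumes that $q_{t+1} = 1 - a/t$ is the probability that a given older vertex sends no edge to vertex $t+1$, that the process starts from a single vertex at $t = 1$, and the names `step` and `law` are hypothetical. As a consistency check one can compare the mean number of isolated vertices with the elementary recursion $\langle N_1(t+1)\rangle = q_{t+1}\langle N_1(t)\rangle + q_{t+1}^{t}$, which follows from the same rule.

```python
from itertools import product
from collections import defaultdict

def step(law, t, a):
    """Evolve the law of the sorted tuple of component sizes from time t to t+1."""
    q = 1.0 - a / t                      # assumed no-edge probability per older vertex
    new_law = defaultdict(float)
    for sizes, prob in law.items():
        # every component is independently swallowed (True) or survives (False)
        for eaten in product((False, True), repeat=len(sizes)):
            p = prob
            merged = 1                   # the new vertex itself
            kept = []
            for j, is_eaten in zip(sizes, eaten):
                if is_eaten:
                    p *= 1.0 - q ** j    # at least one of the j vertices sends an edge
                    merged += j
                else:
                    p *= q ** j          # none of the j vertices sends an edge
                    kept.append(j)
            new_law[tuple(sorted(kept + [merged]))] += p
    return dict(new_law)

a = 0.3
law = {(1,): 1.0}                        # time t = 1: a single vertex
for t in range(1, 6):
    law = step(law, t, a)

# mean number of isolated vertices at time t = 6
print(sum(p * sizes.count(1) for sizes, p in law.items()))
```

The enumeration over subsets is exponential in the number of components, so this is only meant for very small graphs; it is nevertheless a convenient way to test formulas such as the generating-function recursion above.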

The Fokker-Planck eq. (54) may also be formulated as a difference equation,
similar to a discrete Schrödinger equation. Let us for instance specify
it for $w_k = \sum_j \zeta_j^{k}$ with $\zeta_j$, $j = 0, \cdots, M$, a set of complex numbers and let

$$ \Psi_t(\zeta_0, \cdots, \zeta_M) = \Big\langle \prod_k \big[ \zeta_0^{k} + \cdots + \zeta_M^{k} \big]^{N_k(t)} \Big\rangle $$

This parametrization is similar to that used in matrix theory where $w_k$ may
be thought of as the trace of the $k$-th power of a matrix whose eigenvalues are
the $\zeta_j$'s. The contour integral in eq. (54) can then be explicitly evaluated
by deforming the integration contour to pick the simple pole contributions
located at the points $\xi = \zeta_j$. This gives:

$$ \Psi_{t+1}(\zeta_0, \cdots, \zeta_M) = q_{t+1}^{\,t} \sum_j \zeta_j\, \Psi_t(\cdots, \zeta_j / q_{t+1}, \cdots) \qquad (55) $$
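
The difference equation can be tested exactly on very small graphs. The sketch below assumes the form $\Psi_{t+1}(\zeta_0,\cdots,\zeta_M) = q_{t+1}^{\,t}\sum_j \zeta_j\,\Psi_t(\cdots,\zeta_j/q_{t+1},\cdots)$, in which only the $j$-th argument is rescaled, together with $q_{t+1} = 1 - a/t$ and a single vertex at $t = 1$; these are assumptions of the sketch, and all function names are hypothetical. It builds the exact law of the component sizes by enumeration and compares both sides for arbitrary complex $\zeta_j$.

```python
from itertools import product
from collections import defaultdict

def exact_laws(a, t_max):
    """Exact law of the sorted component sizes at times 1..t_max (q = 1 - a/t assumed)."""
    laws = {1: {(1,): 1.0}}
    for t in range(1, t_max):
        q = 1.0 - a / t
        new_law = defaultdict(float)
        for sizes, prob in laws[t].items():
            for eaten in product((False, True), repeat=len(sizes)):
                p, merged, kept = prob, 1, []
                for j, is_eaten in zip(sizes, eaten):
                    p *= (1.0 - q ** j) if is_eaten else q ** j
                    if is_eaten:
                        merged += j
                    else:
                        kept.append(j)
                new_law[tuple(sorted(kept + [merged]))] += p
        laws[t + 1] = dict(new_law)
    return laws

def psi(law, zetas):
    """Psi_t(zeta_0..zeta_M) = < prod_k (sum_j zeta_j^k)^{N_k(t)} >."""
    total = 0.0
    for sizes, prob in law.items():
        term = prob
        for k in sizes:                      # one factor w_k per component of size k
            term *= sum(z ** k for z in zetas)
        total += term
    return total

def psi_next(law, zetas, q, t):
    """Right-hand side of the assumed difference equation."""
    out = 0.0
    for j, zj in enumerate(zetas):
        shifted = list(zetas)
        shifted[j] = zj / q                  # only the j-th argument is rescaled
        out += zj * psi(law, shifted)
    return q ** t * out

a = 0.4
zetas = [1.0, 0.3 + 0.2j, -0.5j]
laws = exact_laws(a, 6)
for t in range(1, 6):
    q = 1.0 - a / t
    print(t, abs(psi(laws[t + 1], zetas) - psi_next(laws[t], zetas, q, t)))
```

For $t = 1$ both sides reduce to $q\,w_1^2 + (1-q)\,w_2$, which gives a hand check of the normalisation used in the sketch.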

Eq. (54) or (55) may be used to prove that the numbers $C_k$ are self-averaging.
Let us choose for example two parameters $\zeta_0 = 1$ and $\zeta_1 = z/t$.
Then, only the clusters of size 1 give a nontrivial contribution to $\Psi_t$ at large
time so that

$$ \Psi_t^{(1)}(z) \equiv \Psi_t(1, z/t) = \Big\langle \prod_k \big( 1 + z^{k}/t^{k} \big)^{t C_k} \Big\rangle \simeq \big\langle e^{z C_1} \big\rangle $$

At large time, the difference equation (55) then reduces to a differential
equation for $\Psi_t^{(1)}(z)$,

$$ (1 + a)\, \partial_z \Psi_t^{(1)}(z) = e^{-a}\, \Psi_t^{(1)}(z) $$

It implies that $\log \Psi_t^{(1)}(z)$ is linear in $z$ at large time, which means that $C_1$ is
self-averaging and stationary at large time. More precisely, integrating the
above equation gives:

$$ \Psi_t^{(1)}(z) = \exp\big(z\, C_1(a)\big), \qquad C_1(a) = \frac{e^{-a}}{a + 1}, $$

in agreement with eq. (12).

Similarly, to prove that $C_2$ possesses a finite self-averaging limit as $t \to \infty$
we choose three parameters $\zeta_0 = 1$, $\zeta_1 = -\zeta_2 = \sqrt{z/t}$ so that only clusters of
size two survive in $\Psi_t$ at large time,

$$ \Psi_t^{(2)}(z) \equiv \Psi_t\big(1, \sqrt{z/t}, -\sqrt{z/t}\big) \simeq \big\langle e^{2 z C_2} \big\rangle . $$

Eq. (55) then gives a first order differential equation for $\Psi_t^{(2)}(z)$ which implies
that

$$ \Psi_t^{(2)}(z) = \exp\big(2 z\, C_2(a)\big), \qquad C_2(a) = \frac{a\, e^{-2a}}{(2a + 1)(a + 1)}, $$

in agreement with eq. (12). Although we do not have a global argument, this
proof may clearly be extended to recursively prove the self-averaging property of any $C_k$
by choosing the parameters $\zeta_j = \omega_j$ with $\omega_j^{k} = z/t$.
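
The stationary densities of the smallest clusters can also be checked by direct simulation, since the component sizes alone form a Markov process. The sketch below is a rough Monte Carlo illustration, not the authors' numerical procedure. It assumes $q_{t+1} = 1 - a/t$ as the per-vertex no-edge probability, a single vertex at $t = 1$, and the closed forms $C_1 = e^{-a}/(1+a)$ and $C_2 = a\,e^{-2a}/((1+a)(1+2a))$, obtained by inserting a power series in $z$ into the stationary equation; `simulate_sizes` and `predicted` are hypothetical names.

```python
import math
import random

def simulate_sizes(a, t_max, seed=0):
    """Component sizes at time t_max, generated with the swallowing rule only."""
    rng = random.Random(seed)
    sizes = [1]                               # time t = 1: a single vertex
    for t in range(1, t_max):
        q = 1.0 - a / t                       # assumed no-edge probability per older vertex
        merged, kept = 1, []
        for j in sizes:
            if rng.random() < q ** j:         # the component survives
                kept.append(j)
            else:                             # the component is swallowed by vertex t+1
                merged += j
        kept.append(merged)
        sizes = kept
    return sizes

def predicted(a):
    """Assumed large-time densities of clusters of size 1 and 2."""
    c1 = math.exp(-a) / (1 + a)
    c2 = a * math.exp(-2 * a) / ((1 + a) * (1 + 2 * a))
    return c1, c2

a, t_max = 0.2, 3000
sizes = simulate_sizes(a, t_max)
print(sizes.count(1) / t_max, sizes.count(2) / t_max, predicted(a))
```

A single run at this size fluctuates at the percent level, so the comparison is only indicative; averaging over seeds or increasing `t_max` tightens it at a cost quadratic in `t_max`.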



12 Appendix C. 

Here we present the proof of eq. (43). Let us first recall the Lagrange formula.
Consider a variable $X$ defined by the implicit relation $f(X) = y$ for some
given analytic function $f$. The solution of this equation is supposed to be
unique so that $X$ is a function of $y$. Given another analytic function $g(w)$ we
look for the Taylor series in $y$ of $g(X(y))$. This composed function may be
presented as a contour integral:

$$ g(X(y)) = \oint \frac{dz}{2i\pi}\, \frac{g(z)\, f'(z)}{f(z) - y} $$

Expanding the integrated rational function in Taylor series in $y$ gives the
Lagrange formula:

$$ g(X(y)) = \sum_{n\ge 0} y^{n} \oint \frac{dz}{2i\pi}\, \frac{g(z)\, f'(z)}{f(z)^{n+1}} $$



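Let $H(x) = 4(G_c(x) + 2x)$. Eq.(g2J) translates into $H(x) \log H(x) = -8x$
or $W(y)\, e^{W(y)} = y$ for $y = -8x$ and $H(x) = e^{W(y)}$. We now apply the Lagrange
formula with $f(X) = X e^{X}$ and $g(w) = e^{w}$. This gives

$$ H(x) = \sum_{n\ge 0} y^{n} \oint \frac{dz}{2i\pi}\, \frac{z + 1}{z^{n+1}}\, e^{(1-n) z} = 1 + \sum_{n\ge 1} \frac{(1-n)^{n-1}}{n!}\, y^{n} , $$

where the $n = 1$ term is equal to $y$.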

With $H(x) = 4(G_c(x) + 2x)$ and $y = -8x$, this proves eq. (43).
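
A short numerical check of the expansion is sketched below. It assumes that the Lagrange formula yields $H = \sum_{n\ge 0} (1-n)^{n-1} y^{n}/n!$ (with $0^0 = 1$ for the $n = 1$ term) and verifies that the partial sums satisfy $H \log H = y$ inside the radius of convergence $|y| < 1/e$; the function name and the truncation order are illustrative.

```python
import math

def h_series(y, n_max=80):
    """Partial sum of sum_{n>=0} (1-n)^(n-1) y^n / n!, assumed expansion of exp(W(y))."""
    # Python evaluates 1**(-1) as 1.0 and 0**0 as 1, which matches the n = 0 and n = 1 terms.
    return sum((1 - n) ** (n - 1) * y ** n / math.factorial(n) for n in range(n_max + 1))

for y in (0.2, 0.05, -0.2):
    h = h_series(y)
    print(y, h, h * math.log(h) - y)   # the last column should vanish up to rounding
```

With $y = -8x$ the same partial sums give $G_c(x) = H/4 - 2x$, so the check applies directly to the series claimed for $G_c$.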
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